Description

FIELD OF THE INVENTION 

[0001] The present invention relates to nucleic acids encoding SID® polypeptides which bind selectively to a polypeptide encoded by a pathogenic strain of the hepatitis C virus, as well as to the SID® polypeptides which are encoded by said nucleic acids.

[0002] The invention also concerns vectors comprising a nucleic acid encoding a SID® polypeptide as well as host 
cells transformed with such vectors. 
[0003] The invention is also directed to two-hybrid methods which make use of the nucleic acids encoding a SID® polypeptide selected from a pathogenic strain of the hepatitis C virus, as well as to methods for selecting molecules which inhibit the binding between a SID® polypeptide and a polypeptide which specifically binds thereto.
[0004] The invention also pertains to marker compounds containing a SID® polypeptide as well as nucleic acids 
encoding such marker compounds and methods and kits using the same. 


BACKGROUND OF THE INVENTION 

[0005] The hepatitis C virus (HCV) causes several liver diseases, including liver cancer. The HCV genome is a plus-stranded RNA that encodes a single polyprotein, which is processed into at least 10 mature polypeptides.

[0006] The structural proteins are located in the amino terminal quarter of the polyprotein, and the non-structural (NS) polypeptides in the remainder (for a review, see HOUGHTON, 1996). The genome organisation resembles that of flaviviruses and pestiviruses, and HCV is now considered to be a member of the Flaviviridae family.
[0007] The gene products of HCV are, from the N-terminus to the C-terminus: core (p22), E1 (gp35), E2 (gp70), NS2 (p21), NS3 (p70), NS4a (p4), NS4b (p27), NS5a (p58), NS5b (p66), as disclosed in figure 1. Core, E1 and E2 are the structural proteins of the virus processed by the host signal peptidase(s). The core protein and the genomic RNA constitute the internal viral core, and E1 and E2 together with the lipid membrane constitute the viral envelope (DUBUISSON et al., 1994; GRAKOUI et al., 1993; HIGIKATA et al., 1993).

[0008] The NS proteins are processed by the viral protein NS3, which has two functional domains: one (Cro-1), encompassing the NS2 region and the N-terminal portion of NS3, cleaves autocatalytically between NS2 and NS3, and the other (Cro-2), located solely in the N-terminal portion of NS3, cleaves the other sites downstream of NS3 (BARTENSCHLAGER et al., 1995; HIGIKATA et al., 1993).

[0009] Various HCV protein-protein interactions have already been identified, notably by two-hybrid methods. Notably, FLAJOLET et al. (2000) have shown interactions between NS3 and NS4A proteins as well as between NS4A and NS2 proteins. These authors have also shown core-core, NS3-E2, NS5A-E1, NS4A-NS3 and NS4A-NS2 interactions. Covalent as well as non-covalent interactions between E1 and E2 have been shown by PATEL et al. (1999). The protein interactions between NS3 and the HCV RNA helicase have also been described (MIN et al., 1999; GALLINARI et al., 1999), as well as the interaction between NS3 and NS4A (URBANI et al., 1999; DI MARCO et al., 2000; BUTKIEWICZ et al., 2000).

[0010] However, the prior art methods only allow the determination of interactions between full length proteins or large domains of proteins encoded by the genome of the hepatitis C virus, which may contain more than one region of interaction with one or several HCV proteins. BUTKIEWICZ et al. (2000) disclose the interaction between the NS3 protease and a small peptide derived from NS4A. However, BUTKIEWICZ et al. (2000) disclose exclusively in vitro assays for interactions between the small peptides derived from NS4A and the NS3 protease from HCV, which may not be of physiological relevance.
[0011] There is a need in the art for polypeptides that contain the minimal aminoacid sequence that is able to bind specifically with a naturally-occurring HCV protein in physiological conditions in order to design new tools for therapeutic and detection purposes related to HCV.

SUMMARY OF THE INVENTION 


[0012] This invention provides nucleic acids encoding polypeptides, which are termed SID® polypeptides, wherein 
these polypeptides are the final products of a double selection method involving a first step of selection of HCV-derived 
polynucleotides through a two-hybrid system and a second selection step involving an alignment between the different 
polynucleotides selected at the first step. 
[0013] The invention also pertains to the SID® polypeptides encoded by the SID® nucleic acids.

[0014] Another object of the invention consists of recombinant vectors containing a SID® nucleic acid as defined above, as well as host cells transformed with such vectors or nucleic acids.

[0015] A further object of the invention consists of two-hybrid methods which make use of these SID® nucleic acids, as well as methods for selecting molecules which inhibit the binding between a SID® polypeptide and a polypeptide that binds specifically thereto, as well as kits for performing these methods.

[0016] It is still a further object of the invention to provide marker compounds which comprise a SID® polypeptide or which are encoded by a polynucleotide containing a SID® nucleic acid as defined above, as well as methods and kits which make use of these marker compounds.

[0017] This invention also relates to pharmaceutical compositions as well as to methods for preventing or curing an HCV viral infection in a human or an animal that use a SID® polypeptide or a SID® nucleic acid as disclosed herein.
[0018] Throughout this application, various publications, patents and published patent applications are cited. The disclosures of these publications, patents and published patent specifications referenced in this application are hereby incorporated by reference into the present disclosure to more fully describe the state of the art to which this invention pertains.

BRIEF DESCRIPTION OF THE FIGURES. 

[0019] Figure 1 consists of a general overview of the HCV genome and its encoded polyprotein. The RNA coding strand is represented with a line for untranslated regions (NCR) and boxes for coding regions.

[0020] Positions and enzymes responsible for cleavage are indicated above. p7 is a secondary cleavage product of 
E2 (adapted from HOUGHTON, 1996). 

[0021] Fig. 2 is a restriction map of the plasmid pAS2AA which may be used for producing a recombinant "Selected Interacting Domain (SID®)" polypeptide or a recombinant marker compound of the invention.

[0022] Fig. 3 is a restriction map of the plasmid pACTII which may be used for producing a recombinant "Selected Interacting Domain (SID®)".

[0023] Fig. 4 is a restriction map of the plasmid pUT18 which may be used for producing a recombinant "Selected Interacting Domain (SID®)".

[0024] Fig. 5 is a restriction map of the plasmid pUT18C which may be used for producing a recombinant "Selected Interacting Domain (SID®)".

[0025] Fig. 6 is a restriction map of the plasmid pT25 which may be used for producing a recombinant "Selected Interacting Domain (SID®)".

[0026] Fig. 7 is a restriction map of the plasmid pKT25 which may be used for producing a recombinant "Selected Interacting Domain (SID®)".

[0027] Fig. 8 is an illustration of the first step of selecting a SID® nucleic acid of the invention, wherein a selection is performed of different sets of overlapping nucleic acids primarily selected through a two-hybrid method, in order to define pre-SID® nucleic acids. Three fragments frg1, frg2 and frg3 have lengths l1, l2 and l3 respectively. Fragments frg1 and frg2 are clustered together if the length of their intersection, l, is greater than 30% of l1 and l2. Fragment frg3 is grouped with fragments frg1 and frg2 if the length of intersection between frg1 and frg3, l', is greater than 30% of l1 and l3 and if the length of intersection between frg2 and frg3, l'', is greater than 30% of l2 and l3.

[0028] Fig. 9 illustrates the selection of a pre-SID® nucleic acid from a particular set of overlapping nucleic acids previously selected through a two-hybrid method. The pre-SID® is defined as the intersection of all the fragments (frg1-6) in a cluster.
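
By way of illustration only, the clustering rule of Fig. 8 and the pre-SID® definition of Fig. 9 may be sketched as follows. The sketch assumes that each fragment is represented by a half-open interval (start, end) of nucleotide positions on the HCV genome, and that a fragment joins an existing cluster only when it satisfies the 30% criterion with every member of that cluster. The names `cluster_fragments` and `pre_sid` and the greedy order of assignment are hypothetical and are not taken from the present description.

```python
from typing import List, Tuple

Fragment = Tuple[int, int]  # (start, end), half-open nucleotide coordinates (assumed representation)


def overlap_len(a: Fragment, b: Fragment) -> int:
    """Length of the intersection of two fragments (0 if they do not overlap)."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]))


def mutually_overlap(a: Fragment, b: Fragment, min_fraction: float = 0.30) -> bool:
    """True if the intersection exceeds min_fraction of the length of BOTH fragments."""
    inter = overlap_len(a, b)
    return inter > min_fraction * (a[1] - a[0]) and inter > min_fraction * (b[1] - b[0])


def cluster_fragments(fragments: List[Fragment], min_fraction: float = 0.30) -> List[List[Fragment]]:
    """Greedy grouping (an assumption): a fragment joins the first cluster whose
    members all satisfy the mutual-overlap criterion with it."""
    clusters: List[List[Fragment]] = []
    for frg in sorted(fragments):
        for cluster in clusters:
            if all(mutually_overlap(frg, member, min_fraction) for member in cluster):
                cluster.append(frg)
                break
        else:
            clusters.append([frg])
    return clusters


def pre_sid(cluster: List[Fragment]) -> Fragment:
    """Pre-SID = intersection of all fragments of a cluster."""
    return (max(s for s, _ in cluster), min(e for _, e in cluster))


if __name__ == "__main__":
    frags = [(0, 100), (20, 120), (40, 130), (500, 600), (520, 640)]
    for c in cluster_fragments(frags):
        print(c, "->", pre_sid(c))
```

Because every pair of fragments in a cluster overlaps, their common intersection is never empty, so `pre_sid` always returns a valid interval for a cluster produced by `cluster_fragments`.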

[0029] Fig. 10 illustrates the selection of a SID® nucleic acid from the overlapping regions between two pre-SID® nucleic acids. A SID® is defined if the length of overlap between two pre-SID®s, l, is greater than 30 bp. Further SID®s are defined by non-overlapping areas if their length (l*) represents more than 30% of the length of one of the fragments which contributes to the corresponding pre-SID® (frg1-6).
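
The rule of Fig. 10 may be sketched, purely as an illustration, as follows. The sketch assumes half-open (start, end) nucleotide intervals; the function name `define_sids` and its parameters are hypothetical. The additional floor of more than 30 nucleotides applied to non-overlapping regions is the one stated in step c) ii) of the detailed method, and is exposed here as the parameter `min_len`.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # (start, end), half-open nucleotide coordinates (assumed representation)


def length(iv: Interval) -> int:
    """Length of an interval; empty or inverted intervals count as 0."""
    return max(0, iv[1] - iv[0])


def subtract(a: Interval, b: Interval) -> List[Interval]:
    """Parts of interval a that lie outside interval b."""
    pieces = []
    if a[0] < b[0]:
        pieces.append((a[0], min(a[1], b[0])))
    if a[1] > b[1]:
        pieces.append((max(a[0], b[1]), a[1]))
    return pieces


def define_sids(pre_a: Interval, frags_a: List[Interval],
                pre_b: Interval, frags_b: List[Interval],
                min_len: int = 30, min_fraction: float = 0.30) -> List[Interval]:
    """Return the SID intervals derived from two pre-SIDs and their contributing fragments."""
    sids: List[Interval] = []
    # Overlap between the two pre-SIDs: kept if longer than min_len.
    overlap = (max(pre_a[0], pre_b[0]), min(pre_a[1], pre_b[1]))
    if length(overlap) > min_len:
        sids.append(overlap)
    # Non-overlapping parts: kept if longer than min_len AND longer than
    # min_fraction of at least one fragment contributing to that pre-SID.
    for pre, other, frags in ((pre_a, pre_b, frags_a), (pre_b, pre_a, frags_b)):
        for piece in subtract(pre, other):
            long_enough = any(length(piece) > min_fraction * length(f) for f in frags)
            if length(piece) > min_len and long_enough:
                sids.append(piece)
    return sids


if __name__ == "__main__":
    print(define_sids((40, 100), [(0, 100), (20, 120), (40, 130)],
                      (60, 200), [(60, 200), (50, 210)]))
```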

[0030] Fig. 11 illustrates a further step of determining SID® nucleic acids after alignment of two overlapping SID® nucleic acids identified according to figure 10. Fragments frg1* and frg2* contribute to both SID®1 and SID®2 (top panel). For each SID®, the number of fragments is counted and fragments are assigned to the SID® with the most fragments. The remaining fragments are re-analysed and a new SID® is defined as the region of intersection of these fragments (bottom panel, SID®2' - fragment 3' and fragment 4').

[0031] Fig. 12 illustrates a map of the vector pB5 which may be used in example 1.
[0032] Fig. 13 illustrates a map of the vector pP6 which may be used in example 1.

DETAILED DESCRIPTION OF THE INVENTION 

[0033] The present invention firstly provides for nucleic acids encoding SID® polypeptides. 
[0034] As generally used herein, a « bait » nucleic acid encodes a « bait » polypeptide. A polypeptide is termed a « bait » polypeptide when this polypeptide is used to select a formerly unknown « prey » nucleic acid encoding a « prey » polypeptide which binds selectively with said « bait » polypeptide. Indeed, a « prey » nucleic acid which has been selected for binding to a given bait polypeptide may be used in another selection method or in another round of the same selection method as a « bait » nucleic acid encoding a « bait » polypeptide for the purpose of selection of new prey nucleic acids, encoding prey polypeptides which bind selectively with said bait polypeptide, it being understood that the nucleic acid encoding said bait polypeptide was formerly selected from a population of prey nucleic acids.

SELECTED INTERACTING DOMAIN (SID®) POLYPEPTIDES AND METHODS FOR THEIR PREPARATION.

[0035] A selected interacting domain polypeptide that binds specifically to a polypeptide of interest is the result of a two-step screening procedure, wherein:

1) the first step consists of selecting and characterizing a collection of nucleic acids (prey nucleic acids) encoding polypeptides which bind specifically to a given bait polypeptide of interest; and

2) the second step of the two-step procedure consists of determining the nucleic acid sequences which encode SID® polypeptides after having generated sets of polynucleotides from the collection of nucleic acids selected at step 1).

[0036] As a result of the original two-step screening procedure disclosed hereunder, every nucleic acid finally selected encodes a "Selected Interacting Domain (SID®)" polypeptide which binds with high specificity to the bait polypeptide of interest.

Step 1) Selecting prey nucleic acids

[0037] The first step of selecting a collection of nucleic acids encoding polypeptides which bind specifically to the bait polypeptide is carried out through a yeast two-hybrid system. The yeast two-hybrid system is designed to study protein-protein interactions in vivo, and relies upon the fusion of a bait protein to the DNA binding domain of the yeast Gal4 protein.

[0038] According to the present invention, the first step of the procedure for selecting a Selected Interacting Domain (SID®) polynucleotide encoding a Selected Interacting Domain (SID®) polypeptide consists of the two-hybrid screening system described by Fromont-Racine et al. (1997) or the method described by FLAJOLET et al. (2000). The yeast two-hybrid system utilizes hybrid proteins to detect protein-protein interactions by means of direct activation of the expression of a reporter gene. In essence, the nucleic acids encoding the two putative protein partners, the bait polypeptide of interest and the prey polypeptide, are genetically fused to the DNA-binding domain of a transcription factor and to a transcriptional activation domain, respectively.

Construction of the prey HCV nucleic acids library. 


[0039] Then, a genomic DNA library prepared from the genome of the pathogenic H77 strain of HCV (Yanagi et al., 1997) is constructed in the specially designed vector pP6 shown in figure 13 after ligation to suitable linkers, such that every genomic DNA insert is fused to a nucleotide sequence in the vector that encodes the transcriptional activation domain of the Gal4 protein.

[0040] The polypeptides encoded by the nucleotide inserts of the genomic DNA library thus prepared are termed "prey" polypeptides in the context of the presently described selection method of prey nucleic acids.

Construction of the bait nucleic acids library 

[0041] The DNA fragments obtained after nebulization of the HCV genomic DNA are also inserted in plasmid pB5 shown in figure 12, wherein these DNA inserts are fused to a polynucleotide encoding the DNA binding domain of the Gal4 protein, and the recombinant vectors are used to transform E. coli cells. The transformed E. coli cells are grown and plasmid DNA is extracted and sequenced.

[0042] Those plasmids which encode in-frame fusion proteins are used as bait plasmids. Bait plasmids thus consist of a collection of recombinant pB5 plasmids each containing inserted therein a DNA fragment from the H77 strain HCV genome encoding a polypeptide consisting of all or part of an HCV protein or alternatively a polypeptide consisting of all or part of two HCV proteins encoded by contiguous nucleic acid sequences of the HCV genome.
[0043] The selected HCV bait nucleic acids of the invention are referred to as the nucleotide sequences SEQ ID N°114 to 150.

[0044] The selected HCV bait polypeptides encoded by the nucleic sequences SEQ ID N°114 to 150 consist respectively of the aminoacid sequences SEQ ID N°77 to 113.

[0045] Detectable marker genes are already present within the chromosomal yeast DNA and consist respectively of the His3 and LacZ genes, such as described by FROMONT-RACINE et al. (1997) or FLAJOLET et al. (2000).
[0046] Then, the collection of nucleic acid inserts contained in the collection of E. coli cell clones containing the genomic DNA or HCV DNA library previously prepared is used to transform a first yeast strain, namely the Y187 Saccharomyces cerevisiae strain (phenotype: MATα, Gal4Δ, gal80Δ, ade2-101, His3, Leu2-3, -112 Trp1-901, Ura3-52, URA3::UASGAL1-LacZ Met).

[0047] The nucleic acid encoding the bait polypeptide of interest is inserted in the appropriate vector, said vector being used to transform a second yeast strain which may be the CG1945 strain (MATa Gal4-542 Gal80-538, Ade2-101, His3Δ200, Leu2-3, -112 Trp1-901 Ura3-52, Lys2-801, URA3::GAL4 17Mers (X3)-CyC1TATA-LacZ LYS2::GAL1 UAS-GAL1TATA-His3 CYHR).

[0048] Then, the two yeast strains are mated to obtain a collection of mated cells.
[0049] The clones derived from the collection of mated cells above which are positive in an X-Gal overlay assay are those for which an interaction between the recombinant bait polypeptide and a polypeptide encoded by a nucleic acid insert originating from the HCV genomic library has occurred.

[0050] The clones derived from the collection of mated cells above may also be selected in the absence of histidine, and the positive clones are those for which an interaction between the recombinant bait polypeptide and a polypeptide encoded by a nucleic acid insert originating from the HCV genomic library has occurred.

[0051] In a further step, the prey nucleic acid inserts contained in the positively selected clones are amplified and sequenced.

Step 2) Determination of the nucleic acid sequences encoding a Selected Interacting Domain (SID®) polypeptide which binds specifically to a bait polypeptide of interest.

[0052] This is the second step of the two-step procedure defined above, which allows the precise selection of the SID® nucleic acids of the present invention which are derived from the H77 strain HCV genome.
[0053] The SID® nucleic acid selection procedure, which is disclosed hereunder, has been specifically designed for the HCV genome, which encodes a single polyprotein and which thus comprises contiguous Open Reading Frames, said polyprotein being further processed to produce at least 10 mature structural and non-structural viral proteins.
[0054] Thus, the second selection step of the two-step procedure consists of a method for determining a polynucleotide encoding a Selected Interacting Domain (SID®) of a prey polypeptide of interest derived from HCV, which prey polypeptide interacts with a bait polypeptide, wherein said method comprises the steps of:

a) selecting, from the collection of prey polynucleotides obtained at the end of the first step of the two-step procedure described herein, all prey polynucleotides encoding a prey polypeptide capable of interacting with said bait polypeptide and containing a common nucleic acid fragment;

b) aligning the nucleotide sequences of the prey polynucleotides selected at step a) and gathering in one set or in a plurality of sets of sequences those nucleotide sequences which have sequences that overlap for more than 30% of their respective nucleic acid length, wherein each common overlapping nucleotide sequence in one set of sequences defines a sequence encoding a pre-SID® polypeptide (see Figures 8 and 9); and

c) aligning two sequences encoding two respective pre-SID® polypeptides (see Figure 10), and:

i) defining an overlapping nucleic acid sequence between the sequences encoding the two respective pre-SID® polypeptides as a sequence encoding a SID® polypeptide, provided that the overlapping sequence is of at least 30 nucleotides in length;

ii) defining a non-overlapping nucleic acid sequence between the sequences encoding the two respective pre-SID® polypeptides as a sequence encoding a SID® polypeptide, provided that (1) said non-overlapping sequence is more than 30 nucleotides in length and (2) said non-overlapping sequence represents at least 30% in length of any one of the polynucleotides contained in the set of prey polynucleotides used for defining the sequence encoding each pre-SID® polypeptide.
This method may further comprise the steps of:

d) counting the number of overlapping prey polynucleotides contained in a first set of polynucleotides defining a sequence encoding a first SID® polypeptide;

e) counting the number of overlapping prey polynucleotides contained in a second set of polynucleotides defining a sequence encoding a second SID® polypeptide which overlaps with the sequence encoding the first SID® polypeptide;

f) determining which sequence among those encoding respectively the first SID® polypeptide and the second SID® polypeptide has been defined with the largest number of prey polynucleotides and selecting this set of prey sequences;

g) adding to the set of prey sequences selected at step f) those sequences that were contained in the set of prey sequences used for defining the sequence encoding the SID® polypeptide with the smallest number of prey sequences and which overlap with the sequence encoding the SID® polypeptide with the largest number of prey sequences;

h) aligning the prey sequences added at step g) with the sequences already contained in the set of prey sequences which defined the sequence encoding the SID® polypeptide with the largest number of prey sequences;

i) defining an overlapping sequence between the whole sequences which were aligned in step h), wherein said overlapping sequence consists of a sequence encoding a SID® polypeptide (see Figure 11).
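
Steps d) to i) above may be sketched, as one possible reading only, in the following way. The sketch assumes half-open (start, end) nucleotide intervals, treats any positive overlap with the larger-set SID® as sufficient for a fragment to be reassigned at step g), and breaks ties in favour of the first set; these choices, as well as the name `resolve_overlapping_sids`, are hypothetical. The second returned value corresponds to the re-analysis of the remaining fragments described for Figure 11.

```python
from typing import List, Optional, Tuple

Interval = Tuple[int, int]  # (start, end), half-open nucleotide coordinates (assumed representation)


def overlap_len(a: Interval, b: Interval) -> int:
    """Length of the intersection of two intervals (0 if disjoint)."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]))


def intersection(fragments: List[Interval]) -> Optional[Interval]:
    """Common region of all fragments, or None if there is none."""
    if not fragments:
        return None
    start = max(s for s, _ in fragments)
    end = min(e for _, e in fragments)
    return (start, end) if start < end else None


def resolve_overlapping_sids(sid1: Interval, frags1: List[Interval],
                             sid2: Interval, frags2: List[Interval]
                             ) -> Tuple[Optional[Interval], Optional[Interval]]:
    # Steps d) to f): count fragments and select the larger set (ties go to set 1, an assumption).
    if len(frags1) >= len(frags2):
        major_sid, major, minor = sid1, list(frags1), list(frags2)
    else:
        major_sid, major, minor = sid2, list(frags2), list(frags1)
    # Step g): move fragments of the smaller set that overlap the larger-set SID.
    moved = [f for f in minor if f not in major and overlap_len(f, major_sid) > 0]
    remaining = [f for f in minor if f not in major and f not in moved]
    # Steps h) and i): the SID is the region common to all aligned fragments.
    # Remaining fragments are re-analysed to define a new SID, if any.
    return intersection(major + moved), intersection(remaining)


if __name__ == "__main__":
    print(resolve_overlapping_sids(
        (60, 100), [(0, 100), (20, 120), (60, 130)],
        (90, 160), [(50, 160), (110, 220), (120, 230)]))
```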

[0055] The method for selecting a SID® nucleic acid encoding a SID® polypeptide is an object of the present invention, as well as any SID® nucleic acid or any SID® polypeptide which may be obtained by this selection method.

SID® nucleic acids of the invention

[0056] The SID® nucleic acids selected as described above starting from the genome of the H77 strain of HCV are the nucleic acid sequences of SEQ ID N°39 to 76 which encode the SID® polypeptides of SEQ ID N°1 to 38.

[0057] A first object of the invention consists of a nucleic acid which encodes a polypeptide selected from the group consisting of the aminoacid sequences SEQ ID N°1 to 38 or a variant thereof, and a sequence complementary thereto.

[0058] For the purposes of the present invention, a first polynucleotide is considered as being « complementary » to a second polynucleotide when each base of the first polynucleotide is paired with the complementary base of the second polynucleotide whose orientation is reversed. The complementary bases are A and T (or A and U), or C and G.
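
As an illustrative sketch of this definition, the following code tests whether a first sequence is the reverse complement of a second one. It assumes sequences written 5' to 3' with the letters A, C, G, T or U only; U is simply treated as T for the comparison. The function names are hypothetical.

```python
# Base-pairing table for DNA letters; U is normalised to T before use.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def _normalise(seq: str) -> str:
    """Upper-case the sequence and treat U as T so that RNA and DNA can be compared."""
    return seq.upper().replace("U", "T")


def reverse_complement(seq: str) -> str:
    """Complement every base, then reverse the orientation."""
    return _normalise(seq).translate(_COMPLEMENT)[::-1]


def are_complementary(first: str, second: str) -> bool:
    """True if each base of `first` pairs with the corresponding base of `second`
    read in the reverse orientation."""
    return _normalise(first) == reverse_complement(second)


if __name__ == "__main__":
    print(reverse_complement("ATGC"))
    print(are_complementary("ATGC", "GCAT"))
```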

[0059] Preferably, any one of the nucleic acids or polypeptides encompassed by the invention is in a purified or an isolated form.

[0060] The term "isolated" for the purposes of the present invention designates a biological material (nucleic acid or 
protein) which has been removed from its original environment (the environment in which it is naturally present). 
[0061] For example, a polynucleotide present in the natural state in a plant or an animal is not isolated. The same polynucleotide separated from the adjacent nucleic acids in which it is naturally inserted in the genome of the plant or animal is considered as being "isolated".

[0062] Such a polynucleotide may be included in a vector and/or such a polynucleotide may be included in a composition and remains nevertheless in the isolated state because of the fact that the vector or the composition does not constitute its natural environment.

[0063] The term "purified" does not require the material to be present in a form exhibiting absolute purity, exclusive 
of the presence of other compounds. It is rather a relative definition. 

[0064] A polynucleotide is in the "purified" state after purification of the starting material or of the natural material by at least one order of magnitude, preferably 2 or 3 and more preferably 4 or 5 orders of magnitude.
[0065] "Isolated polypeptide" or "isolated protein" is a polypeptide or protein which is substantially free of those compounds that are normally associated therewith in its natural state (e.g., other proteins or polypeptides, nucleic acids, carbohydrates, lipids). "Isolated" is not meant to exclude artificial or synthetic mixtures with other compounds, or the presence of impurities which do not interfere with biological activity, and which may be present, for example, due to incomplete purification, addition of stabilisers, or compounding into a pharmaceutically acceptable preparation.


Variants of a selected interacting domain (SID®) polypeptide and nucleic acids encoding them. 

[0066] As intended herein, a variant of a Selected Interacting Domain (SID®) polypeptide may be either a variant polypeptide of the Selected Interacting Domain (SID®) polypeptide or a polypeptide which is encoded by a nucleic acid variant of the polynucleotide encoding said Selected Interacting Domain (SID®) polypeptide.

[0067] Polynucleotides which encode a polypeptide variant of a Selected Interacting Domain (SID®) polypeptide, as the term is used herein, are polynucleotides that differ from the reference polynucleotide encoding the parent SID® polypeptide. A variant of a polynucleotide may be a naturally occurring variant such as a naturally occurring allelic variant, or it may be a variant that is not known to occur naturally. Such non-naturally occurring variants of the reference polynucleotide may be generated by mutagenesis techniques, including those applied to polynucleotides, cells or organisms, well known to one skilled in the art.

[0068] Generally, differences are limited so that the nucleotide sequences of the reference and the variant are closely 
similar overall and, in many regions, identical. 

[0069] Variants of polynucleotides according to the invention include, without being limited to, nucleotide sequences which are at least 95% identical after optimal alignment to the reference polynucleotide of SEQ ID N°39 to 76 encoding the reference Selected Interacting Domain (SID®) polypeptide, preferably at least 96%, 97%, 98% and most preferably at least 99% identical to the reference polynucleotide. Similarly, a variant of a SID® polypeptide of the invention consists of a polypeptide having at least 95% aminoacid identity with a polypeptide selected from the aminoacid sequences SEQ ID N°1 to 38, and preferably at least 96%, 97%, 98% and most preferably at least 99% aminoacid identity with one of SEQ ID N°1 to 38.

[0070] Identity refers to sequence identity between two peptides or between two nucleic acid molecules. Identity between sequences can be determined by comparing a position in each of the sequences which may be aligned for purposes of comparison. When a position in the compared sequences is occupied by the same base or amino acid, then the sequences are identical at that position. A degree of identity between nucleic acid sequences is a function of the number of identical nucleotides at positions shared by these sequences. A degree of identity between amino acid sequences is a function of the number of identical aminoacids at positions shared by these sequences. Since two polynucleotides may each (1) comprise a sequence (i.e., a portion of the complete polynucleotide sequence) that is similar between the two polynucleotides, and (2) further comprise a sequence that is divergent between the two polynucleotides, sequence comparisons between two (or more) polynucleotides are typically performed by comparing sequences of the two polynucleotides over a "comparison window" to identify and compare local regions of sequence similarity. A "comparison window", as used herein, refers to a conceptual segment of at least 20 contiguous nucleotide positions wherein a polynucleotide sequence may be compared to a reference sequence of at least 20 contiguous nucleotides and wherein the portion of the polynucleotide sequence in the comparison window may comprise additions or deletions (i.e., gaps) of 20 percent or less as compared to the reference sequence (which does not comprise additions or deletions) for optimal alignment of the two sequences. Optimal alignment of sequences for determining a comparison window may be conducted by the local homology algorithm of Smith and Waterman (1981), by the homology alignment algorithm of Needleman and Wunsch (1972), by the search for similarity method of Pearson and Lipman (1988), by computerized implementations of these algorithms (GAP, BESTFIT, FASTA, and TFASTA in the Wisconsin Genetics Software Package Release 7.0, Genetics Computer Group, 575, Science Dr. Madison, WI), or by inspection. The best alignment (i.e., resulting in the highest percentage of identity over the comparison window) generated by the various methods is selected. The term "sequence identity" means that two polynucleotide sequences are identical (i.e., on a nucleotide-by-nucleotide basis) over the window of comparison. The term "percentage of sequence identity" is calculated by comparing two optimally aligned sequences over the window of comparison, determining the number of positions at which the identical nucleic acid base (e.g. A, T, C, G, U or I) occurs in both sequences to yield the number of matched positions, dividing the number of matched positions by the total number of positions in the window of comparison (i.e., the window size), and multiplying the result by 100 to yield the percentage of sequence identity.
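
The calculation of the percentage of sequence identity described above may be illustrated by the following minimal sketch. It assumes that the two sequences have already been optimally aligned by one of the cited methods and are supplied as strings of equal length in which the character "-" denotes a gap; the alignment itself and the minimum window size of 20 positions are not implemented or enforced. The function name `percent_identity` is hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of identical positions over the comparison window.

    Both arguments are pre-aligned sequences of equal length; '-' marks a gap.
    A position counts as matched only if both sequences carry the same residue.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("comparison window is empty")
    matches = sum(
        1
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != "-"
    )
    # Matched positions divided by the window size, multiplied by 100.
    return 100.0 * matches / len(aligned_a)


if __name__ == "__main__":
    print(percent_identity("ACGTACGTAC", "ACGTTCGTAC"))
```

A variant could then be tested against the thresholds recited in the description, for example `percent_identity(a, b) >= 95.0`.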
[0071] Most preferably, the percentage of nucleic acid or aminoacid identity between two nucleic acid or aminoacid sequences is calculated using the BLAST software (Version 2.06 of September 1998) with the default parameters.
[0072] Nucleotide changes present in a variant polynucleotide may be silent, which means that they do not alter the 
aminoacid encoded by the reference polynucleotide. 

[0073] However, nucleotide changes may also result in aminoacid substitutions, additions, deletions, fusions and truncations in the Selected Interacting Domain (SID®) polypeptide encoded by the reference sequence.
[0074] The substitutions, deletions or additions may involve one or more nucleotides. Alterations may produce conservative or non-conservative aminoacid substitutions, deletions or additions.

[0075] Most preferably, the variant of a Selected Interacting Domain (SID®) polypeptide encoded by a variant polynucleotide possesses at least the same affinity of binding to its protein or polypeptide counterpart, against which it has been initially selected as described above.
[0076] The affinity of a given SID® polypeptide of the invention for a polypeptide to which it specifically binds is defined as the affinity constant Ka, wherein

$$K_a = \frac{[\text{SID®/polypeptide complex}]}{[\text{free SID®}]\,[\text{free polypeptide}]}$$

where [free SID®], [free polypeptide] and [SID®/polypeptide complex] are the concentrations at equilibrium, respectively, of the free SID® polypeptide, of the free polypeptide to which the SID® polypeptide specifically binds and of the complex formed between the SID® polypeptide and the polypeptide to which said SID® polypeptide specifically binds.

[0077] Most preferably, the affinity of a SID® polypeptide of the invention or a variant thereof for its polypeptide counterpart (polypeptide partner) is assessed on a Biacore™ apparatus marketed by Amersham Pharmacia Biotech Company such as described by SZABO et al. (1995) and by Edwards and Leatherbarrow (1997).
[0078] As used herein, the expression « at least the same affinity » with reference to the affinity of binding of a SID® polypeptide of the invention to another polypeptide means that the Ka is identical to, or at least two-fold, preferably at least three-fold and most preferably at least five-fold greater than, the Ka value of reference.
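
For illustration, the affinity constant defined above and the comparison with a reference Ka may be sketched as follows. The sketch assumes that equilibrium concentrations are supplied in consistent molar units; the function names, the tier labels and the reading of "identical" as "not lower than the reference" are hypothetical choices made for the example, and the numerical values are arbitrary.

```python
def affinity_constant(complex_conc: float, free_sid: float, free_partner: float) -> float:
    """Ka = [SID/polypeptide complex] / ([free SID] * [free polypeptide]), at equilibrium."""
    if free_sid <= 0 or free_partner <= 0:
        raise ValueError("free concentrations must be positive")
    return complex_conc / (free_sid * free_partner)


def affinity_tier(ka_variant: float, ka_reference: float) -> str:
    """Classify a variant's Ka against the reference Ka using the fold thresholds of the text."""
    if ka_reference <= 0:
        raise ValueError("reference Ka must be positive")
    fold = ka_variant / ka_reference
    if fold >= 5:
        return "at least five-fold greater"
    if fold >= 3:
        return "at least three-fold greater"
    if fold >= 2:
        return "at least two-fold greater"
    if fold >= 1:
        return "at least the same"
    return "lower"


if __name__ == "__main__":
    # Arbitrary example values in mol/L (not measured data).
    ka_ref = affinity_constant(2e-9, 1e-9, 1e-6)
    ka_var = affinity_constant(8e-9, 1e-9, 1e-6)
    print(ka_ref, ka_var, affinity_tier(ka_var, ka_ref))
```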

[0079] In another preferred embodiment, the variant of a Selected Interacting Domain (SID®) polypeptide which is 
encoded by a variant polynucleotide of the invention possesses a higher specificity of binding to its counterpart polypep- 
tide or protein than the reference Selected Interacting Domain (SID®) polypeptide. 
[0080] A variant of a Selected Interacting Domain (SID®) polypeptide according to the invention may be (1) one in
which one or more, most preferably from one to three, of the aminoacid residues are substituted with a conserved or 
a non-conserved aminoacid residue and such substituted aminoacid residue may or may not be one encoded by the 
genetic code, or (2) one in which one or more of the aminoacid residues includes a substituent group. 

[0081] In the case of an aminoacid substitution in the aminoacid sequence of a Selected Interacting Domain (SID®)
polypeptide according to the invention, one or several aminoacids, consecutive or non-consecutive, are replaced by
"equivalent" aminoacids. The expression "equivalent" aminoacid is used herein to designate any aminoacid that may
be substituted for one of the aminoacids belonging to the native Selected Interacting Domain (SID®) polypeptide structure
without decreasing the binding properties of the corresponding peptides to their counterpart polypeptide or protein,
as compared with the reference Selected Interacting Domain (SID®) polypeptide.

[0082] These equivalent aminoacids may be determined either by their structural homology with the initial aminoacids
to be replaced, or by the similarity of their net charge or of their hydrophobicity.

[0083] By an equivalent aminoacid according to the present invention is also meant the replacement of a residue in
the L-form by a residue in the D-form or the replacement of a glutamic acid residue by a pyroglutamic acid compound.
The synthesis of peptides containing at least one residue in the D-form is, for example, described by KOCH (1977). A
specific embodiment of a variant of a Selected Interacting Domain (SID®) polypeptide according to the invention includes,
but is not limited to, a peptide molecule which is resistant to proteolysis, such as a peptide in which the -CONH-
peptide bond is modified and replaced by a (-CH2NH-) reduced bond, a (-NHCO-) retroinverso bond, a (-CH2-O-)
methylene-oxy bond, a (-CH2-S-) thiomethylene bond, a (-CH2CH2-) carba bond, a (-CO-CH2-) hydroxyethylene bond,
a (-N-N-) bond or also a -CH=CH- bond.

[0084] As used herein, a variant of a SID® polypeptide of the invention also encompasses a polypeptide having an 
aminoacid sequence consisting of at least: 

- 45 consecutive aminoacids of SEQ ID N°1;
- 30 consecutive aminoacids of SEQ ID N°2;
- 65 consecutive aminoacids of SEQ ID N°3;
- 30 consecutive aminoacids of SEQ ID N°4;
- 130 consecutive aminoacids of SEQ ID N°5;
- 25 consecutive aminoacids of SEQ ID N°6;
- 23 consecutive aminoacids of SEQ ID N°7;
- 48 consecutive aminoacids of SEQ ID N°8;
- 36 consecutive aminoacids of SEQ ID N°9;
- 25 consecutive aminoacids of SEQ ID N°10;
- 24 consecutive aminoacids of SEQ ID N°11;
- 37 consecutive aminoacids of SEQ ID N°12;
- 25 consecutive aminoacids of SEQ ID N°13;
- 30 consecutive aminoacids of SEQ ID N°14;
- 27 consecutive aminoacids of SEQ ID N°15;
- 69 consecutive aminoacids of SEQ ID N°16;
- 130 consecutive aminoacids of SEQ ID N°17;
- 33 consecutive aminoacids of SEQ ID N°18;
- 25 consecutive aminoacids of SEQ ID N°19;
- 40 consecutive aminoacids of SEQ ID N°20;
- 78 consecutive aminoacids of SEQ ID N°21;
- 39 consecutive aminoacids of SEQ ID N°22;
- 57 consecutive aminoacids of SEQ ID N°23;
- 26 consecutive aminoacids of SEQ ID N°24;
- 68 consecutive aminoacids of SEQ ID N°25;
- 34 consecutive aminoacids of SEQ ID N°26;
- 42 consecutive aminoacids of SEQ ID N°27;
- 48 consecutive aminoacids of SEQ ID N°28;
- 102 consecutive aminoacids of SEQ ID N°29;
- 49 consecutive aminoacids of SEQ ID N°30;
- 92 consecutive aminoacids of SEQ ID N°31;
- 71 consecutive aminoacids of SEQ ID N°32;
- 55 consecutive aminoacids of SEQ ID N°33;
- 69 consecutive aminoacids of SEQ ID N°34;
- 23 consecutive aminoacids of SEQ ID N°35;
- 33 consecutive aminoacids of SEQ ID N°36;
- 32 consecutive aminoacids of SEQ ID N°37;

and

- 22 consecutive aminoacids of SEQ ID N°38.

[0085] Without wishing to be bound by any particular theory, the inventors believe that polypeptides having an aminoacid
length about 10% shorter than the aminoacid length of any one of the SID® polypeptides of SEQ ID N°1 to 39
of the invention have a high probability of retaining the binding properties of the parent SID® polypeptide towards
a given (bait) polypeptide.

[0086] The invention also pertains to a nucleic acid encoding a SID® polypeptide which is selected from the group
consisting of the sequences SEQ ID N°39 to 76, and a sequence complementary thereto.

[0087] The invention is also directed to a nucleic acid encoding a variant of SID® polypeptide selected from the 
group consisting of the sequences SEQ ID N°39 to 76, in reference to the definition of the SID® polypeptide variants 
above. 

[0088] For example, a nucleic acid encoding a polypeptide having an aminoacid sequence consisting of at least 45
consecutive aminoacids of SEQ ID N°1 comprises at least 135 (45 x 3) consecutive nucleotides of the polynucleotide
of SEQ ID N°39.

[0089] The same definition also applies to nucleic acids encoding variants of the SID® polypeptides of SEQ ID N°2
to 38, which are part of the invention.
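The arithmetic of paragraph [0088] (three nucleotides per encoded aminoacid) can be sketched as follows. This is an illustrative sketch only: the helper name is hypothetical, only three of the thresholds listed in paragraph [0084] are reproduced, and the pairing of polypeptide SEQ ID N°n with nucleic acid SEQ ID N°(n + 38) is an assumption extrapolated from the single pair (N°1, N°39) given in the text.

```python
CODON_LENGTH = 3  # nucleotides needed to encode one aminoacid

# Minimum consecutive aminoacids for a few SID polypeptides, copied from
# paragraph [0084] (keys are polypeptide SEQ ID numbers).
MIN_AMINOACIDS = {1: 45, 2: 30, 3: 65}

# Assumption: nucleic acid SEQ ID = polypeptide SEQ ID + 38, as for N°1 / N°39.
NUCLEIC_ACID_OFFSET = 38


def min_coding_nucleotides(min_aminoacids):
    """Return the minimum number of consecutive nucleotides encoding
    `min_aminoacids` consecutive aminoacids."""
    if min_aminoacids < 0:
        raise ValueError("aminoacid count must be non-negative")
    return min_aminoacids * CODON_LENGTH


for seq_id, n_aa in sorted(MIN_AMINOACIDS.items()):
    n_nt = min_coding_nucleotides(n_aa)
    print(f"SEQ ID N°{seq_id}: {n_aa} aminoacids -> at least {n_nt} "
          f"nucleotides of SEQ ID N°{seq_id + NUCLEIC_ACID_OFFSET}")
```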

[0090] The invention further relates to a nucleic acid encoding a polypeptide having an aminoacid sequence comprising
from 1 to 3 substitutions, additions or deletions of one aminoacid as regards a polypeptide selected from the
group consisting of the aminoacid sequences SEQ ID N°1 to 38 or a sequence complementary thereto.
[0091] Another object of the invention consists of a polypeptide selected from the group consisting of the aminoacid 
sequences SEQ ID N°39 to 76 or a variant thereof. 

[0092] Encompassed in the family of variants of a SID® polypeptide of the invention are those polypeptides having
an aminoacid sequence comprising from 1 to 3 substitutions, additions or deletions of one aminoacid as regards a
polypeptide selected from the group consisting of the aminoacid sequences SEQ ID N°1 to 38.
[0093] The invention is also directed to an antibody directed against a SID® polypeptide as defined above, or to
a variant thereof.

[0094] The antibodies directed specifically against the Selected Interacting Domain (SID®) polypeptide or a variant
thereof may be either radioactively or non-radioactively labelled.

[0095] Monoclonal antibodies directed against a SID® polypeptide may be prepared from hybridomas according to
the technique described by Kohler and Milstein in 1975. Polyclonal antibodies may be prepared by immunization of a
mammal, especially a mouse or a rabbit, with the SID® polypeptide that is combined with an adjuvant of immunity, and
then by purifying the specific antibodies contained in the serum of the immunized animal on an affinity chromatography
column on which the polypeptide that has been used as the antigen has previously been immobilized.

[0096] Antibodies directed against a SID® polypeptide may also be produced by the trioma technique and by the 
human B-cell hybridoma technique (Kozbor et al., 1983). 

[0097] Antibodies directed to a SID® polypeptide include chimeric single chain Fv antibody fragments (US Patent
N° US 4,946,778; Martineau et al., 1998), antibody fragments obtained through phage display libraries (Ridder et al.,
1995) and humanized antibodies (Reinmann et al., 1997; Leger et al., 1997). Also, transgenic mice, or other organisms
such as other mammals, may be used to express antibodies, including for example, humanized antibodies directed
against a SID® polypeptide of the invention, or a variant thereof.

VECTORS OF THE INVENTION 

[0098] The nucleic acids coding for a Selected Interacting Domain (SID®) polypeptide or a variant thereof, which
are defined in the section above, can be inserted into an appropriate expression vector, i.e., a vector which contains
the necessary elements for the transcription and translation of the inserted protein-coding sequence. Such transcription
elements include a regulatory region and a promoter as defined previously. Thus, the nucleic acid encoding a marker
compound of the invention is operably linked with a promoter in an expression vector, wherein said expression vector
may include a replication origin.

[0099] The necessary transcriptional and translational signals are most preferably provided by the recombinant expression
vector.

Structure of the vectors encompassed by the invention

[0100] A wide variety of host/expression vector combinations may be employed in expressing the nucleic acids of
this invention. Useful expression vectors, for example, may consist of segments of chromosomal, non-chromosomal
and synthetic DNA sequences. Suitable vectors include derivatives of SV40 and known bacterial plasmids, e.g., Escherichia
coli plasmids col E1, pCR1, pBR322, pMal-C2, pET, pGEX (Smith et al., 1988), pMB9 and their derivatives,
plasmids such as RP4; phage DNAs, e.g., the numerous derivatives of phage λ, e.g., NM989, and other phage DNA,
e.g., M13 and filamentous single stranded phage DNA; yeast plasmids such as the 2µ plasmid or derivatives thereof;
vectors useful in eukaryotic cells, such as vectors useful in insect or mammalian cells; vectors derived from combinations
of plasmids and phage DNAs, such as plasmids that have been modified to employ phage DNA or other expression
control sequences; and the like.

[0101] For example, in a baculovirus expression system, both non-fusion transfer vectors, such as but not limited to
pVL941 (BamHI cloning site; Summers), pVL1393 (BamHI, SmaI, XbaI, EcoRI, NotI, XmaIII, BglII, and PstI cloning
site; Invitrogen), pVL1392 (BglII, PstI, NotI, XmaIII, EcoRI, XbaI, SmaI, and BamHI cloning site; Summers and Invitrogen),
and pBlueBacIII (BamHI, BglII, PstI, NcoI, and HindIII cloning site, with blue/white recombinant screening
possible; Invitrogen), and fusion transfer vectors, such as but not limited to pAc700 (BamHI and KpnI cloning site, in
which the BamHI recognition site begins with the initiation codon; Summers), pAc701 and pAc702 (same as pAc700,
with different reading frames), pAc360 (BamHI cloning site 36 base pairs downstream of a polyhedrin initiation codon;
Invitrogen (195)), and pBlueBacHisA, B, C (three different reading frames, with BamHI, BglII, PstI, NcoI, and HindIII
cloning site, an N-terminal peptide for ProBond purification, and blue/white recombinant screening of plaques; Invitrogen
(220)) can be used.

[0102] Mammalian expression vectors contemplated for use in the invention include vectors with inducible promoters,
such as the dihydrofolate reductase (DHFR) promoter, e.g., any expression vector with a DHFR expression vector, or
a DHFR/methotrexate co-amplification vector, such as pED (PstI, SalI, SbaI, SmaI, and EcoRI cloning site, with the
vector expressing both the cloned gene and DHFR; Kaufman, 1991). Alternatively, a glutamine synthetase/methionine
sulfoximine co-amplification vector can be used, such as pEE14 (HindIII, XbaI, SmaI, SbaI, EcoRI, and BclI cloning site, in which
the vector expresses glutamine synthase and the cloned gene; Celltech). In another embodiment, a vector that directs
episomal expression under control of Epstein Barr Virus (EBV) can be used, such as pREP4 (BamHI, SfiI, XhoI, NotI,
NheI, HindIII, NheI, PvuII, and KpnI cloning site, constitutive RSV-LTR promoter, hygromycin selectable marker; Invitrogen),
pCEP4 (BamHI, SfiI, XhoI, NotI, NheI, HindIII, NheI, PvuII, and KpnI cloning site, constitutive hCMV immediate
early gene, hygromycin selectable marker; Invitrogen), pMEP4 (KpnI, PvuI, NheI, HindIII, NotI, XhoI, SfiI, BamHI
cloning site, inducible metallothionein IIa gene promoter, hygromycin selectable marker; Invitrogen), pREP8 (BamHI,
XhoI, NotI, HindIII, NheI, and KpnI cloning site, RSV-LTR promoter, histidinol selectable marker; Invitrogen), pREP9
(KpnI, NheI, HindIII, NotI, XhoI, SfiI, and BamHI cloning site, RSV-LTR promoter, G418 selectable marker; Invitrogen),
and pEBVHis (RSV-LTR promoter, hygromycin selectable marker, N-terminal peptide purifiable via ProBond resin and
cleaved by enterokinase; Invitrogen). Selectable mammalian expression vectors for use in the invention include pRc/CMV
(HindIII, BstXI, NotI, SbaI, and ApaI cloning site, G418 selection; Invitrogen), pRc/RSV (HindIII, SpeI, BstXI, NotI,
XbaI cloning site, G418 selection; Invitrogen), and others. Vaccinia virus mammalian expression vectors (see, Kaufman,
1991, supra) for use according to the invention include but are not limited to pSC11 (SmaI cloning site, TK- and β-gal
selection), pMJ601 (SalI, SmaI, AflI, NarI, BspMII, BamHI, ApaI, NheI, SacII, KpnI, and HindIII cloning site; TK- and β-gal
selection), and pTKgptF1S (EcoRI, PstI, SalI, AccI, HindII, SbaI, BamHI, and Hpa cloning site, TK or XPRT selection).

[0103] Yeast expression systems can also be used according to the invention to express a Selected Interacting
Domain (SID®) polypeptide or a variant thereof and also a marker compound as defined herein. For example, the non-fusion
pYES2 vector (XbaI, SphI, ShoI, NotI, GstXI, EcoRI, BstXI, BamHI, SacI, KpnI, and HindIII cloning site; Invitrogen)
or the fusion pYESHisA, B, C (XbaI, SphI, ShoI, NotI, BstXI, EcoRI, BamHI, SacI, KpnI, and HindIII cloning site,
N-terminal peptide purified with ProBond resin and cleaved with enterokinase; Invitrogen), to mention just two, can be
employed according to the invention.

[0104] Once a suitable host system and growth conditions are established, recombinant expression vectors can be
propagated and prepared in quantity. As previously explained, the expression vectors which can be used include, but
are not limited to, the following vectors or their derivatives: human or animal viruses such as vaccinia virus or adenovirus;
insect viruses such as baculovirus; yeast vectors; bacteriophage vectors (e.g., lambda), and plasmid and cosmid
DNA vectors, to name but a few.

[0105] Vectors are introduced into the desired host cells by methods known in the art, e.g., transfection, electroporation,
microinjection, transduction, cell fusion, DEAE dextran, calcium phosphate precipitation, lipofection (lysosome
fusion), use of a gene gun, or a DNA vector transporter (see, e.g., Wu et al., 1992; Wu and Wu, 1988; Canadian Patent
Application No. 2,012,311, filed March 15, 1990).

[0106] A cell has been "transfected" by exogenous or heterologous DNA when such DNA has been introduced inside
the cell. A cell has been "transformed" by exogenous or heterologous DNA when the transfected DNA effects a phenotypic
change.

[0107] For introducing a vector into a host cell, explicit reference is made to research carried out by the group of E.
Wagner, relating to gene delivery by means of plasmid-polylysine complexes (Curiel et al., 1991; and Curiel et al.,
1992). The plasmid-polylysine complex, upon exposure to certain cell lines, showed at least some expression
of the gene. Further, it was found that the expression efficiency increased considerably due to the binding of
transferrin to the plasmid-polylysine complex. Transferrin gives rise to close cell-complex contact with cells comprising
transferrin receptors; it binds the entire complex to the transferrin receptor of cells. Subsequently, at least part of the
entire complex was found to be incorporated in the cells investigated.

[0108] Several different approaches have been developed for gene transfer. These include the use of viral based
vectors (e.g., retroviruses, adenoviruses, and adeno-associated viruses) (Drumm, M. L. et al., Rosenfeld, M. A. et al.,
1992; and Muzyczka, 1992), charge associating the DNA with an asialoorosomucoid/poly L-lysine complex (Wilson, J.
M. et al., 1992), charge associating the DNA with cationic liposomes (Brigham, K. L. et al., 1993) and the use of cationic
liposomes in association with a poly-L-lysine antibody complex (Trubetskoy, V. S. et al., 1993).

Compositions comprising vectors of the invention.

[0109] Although non-viral based transfection systems have not exhibited the efficiency of viral vectors, they have
received significant attention, in both in vitro and in vivo research, because of their theoretical safety when compared
to viral vectors. Synthetic cationic molecules have been reported which "coat" the nucleic acid through the
interaction of the cationic sites on the transfection agent and the anionic sites on the nucleic acid. The positively charged
coating reportedly interacts with the negatively charged cell membrane to facilitate the passage of the nucleic acid
through the cell membrane by non-specific endocytosis (Schofield, 1995). These compounds have, however, exhibited
considerable sensitivity to natural serum inhibition, which has probably limited their efficiency in vivo as gene transfection
agents (Behr, 1994).

[0110] A number of attempts have been made to improve the efficiency of lipid-like cationic transfection agents, some
involving the use of polycationic molecules. For example, several transfection agents have been developed that contain
the polycationic compound spermine covalently attached to a lipid carrier. Behr (1994) discloses a lipopolyamine and
shows it to be more efficient at transfecting cells than single charge molecules (albeit still less efficient than viral vectors).
The agent reported by Behr was, however, toxic, and caused cell death.

[0111] A few such lipid delivery systems for transporting DNA, proteins, and other chemical materials across membrane
boundaries have been synthesized by research groups and business entities. Most of the synthesis schemes
are relatively complex and generate lipid based delivery systems having only limited transfection abilities. A need exists
in the field of gene therapy for cationic lipid species that have a high biopolymer transport efficiency. It has been known
for some time that a very limited number of certain quaternary ammonium derivatized (cationic) liposomes spontaneously
associate with DNA, fuse with cell membranes, and deliver the DNA into the cytoplasm (as noted above, these
species have been termed "cytofectins"). LIPOFECTIN™ represents a first generation of cationic liposome formulation
development. LIPOFECTIN™ is composed of a 1:1 formulation of the quaternary ammonium containing compound
DOTMA and dioleoylphosphatidylethanolamine sonicated into small unilamellar vesicles in water. Problems
associated with LIPOFECTIN™ include non-metabolizable ether bonds, inhibition of protein kinase C activity, and
direct cytotoxicity. In response to these problems, a number of other related compounds have been developed. The
monoammonium compounds of the subject invention improve upon the capabilities of existing cationic liposomes and
serve as a very efficient delivery system for biologically active chemicals.

Most preferred vectors of the invention.

[0112] Most preferred recombinant vectors according to the invention include pASAA (figure 2), pACTIIst (figure 3),
pT18 (figure 4), pUT18C (figure 5), pT25 (figure 6), pKT25 (figure 7), pB5 (figure 12) and pP6 (figure 13) containing
inserted therein a nucleic acid encoding a Selected Interacting Domain (SID®) polypeptide or a variant thereof as
defined above.

[0113] The present invention is also directed to a vector usable in a two-hybrid method which consists of the vector 
pP6 which is shown in figure 13. As disclosed in example 1, the vector pP6 has been successfully used for preparing 
a collection of recombinant plasmids consisting of a genomic DNA library from the pathogenic strain H77 of the hepatitis 
C virus. 

[0114] The invention also pertains to a vector usable in a two-hybrid method which consists of the vector pB5. As
disclosed in example 1, the vector pB5 has been successfully used in a yeast two-hybrid method as a bait plasmid.

RECOMBINANT CELL HOSTS

[0115] In one embodiment, a Selected Interacting Domain (SID®) polypeptide of the invention or a variant thereof
is recombinantly produced in a desired host cell which has been transfected or transformed with a nucleic acid encoding
said Selected Interacting Domain (SID®) polypeptide or with a recombinant vector as defined above within which a
nucleic acid encoding a Selected Interacting Domain (SID®) polypeptide of the invention is inserted.
[0116] Recombinant cell hosts are another aspect of the present invention.

[0117] Such cell hosts generally comprise at least one copy of a nucleic acid encoding a Selected Interacting Domain
(SID®) polypeptide of the invention or a variant thereof.

[0118] Preferred cells for expression purposes will be selected according to the objective which is sought. For example,
in the embodiment wherein the production of a Selected Interacting Domain (SID®) polypeptide according to
the invention in large quantities is sought, the nature of the host cell used for its production is relatively unimportant,
provided that large amounts of Selected Interacting Domain (SID®) polypeptides of the invention are produced and
that optional further purification steps may be carried out easily.

[0119] However, in the embodiment wherein the Selected Interacting Domain (SID®) polypeptide is recombinantly
produced within a host organism for the purpose of interfering with a specific protein-protein interaction, then the host
organism is selected among the host organisms which are suspected to produce naturally said polypeptide of interest.
[0120] Consequently, mammalian and typically human cells, as well as bacterial, yeast, fungal, insect, nematode
and plant cells are cell hosts encompassed by the invention, which may be transfected either by a nucleic acid or
a recombinant vector as defined above.

[0121] Examples of suitable recombinant host cells include VERO cells, HELA cells (e.g. ATCC N°CCL2), CHO cell
lines (e.g. ATCC N°CCL61), COS cells (e.g. COS-7 cells; COS cell referred to ATCC N°CRL1650), WI38, BHK, HepG2,
3T3 (e.g. ATCC N°CRL6361), A549, PC12, K562 cells, 293 cells, Sf9 cells (e.g. ATCC N°CRL1711) and Cv1 cells
(e.g. ATCC N°CCL70).

[0122] Other suitable host cells usable according to the invention include prokaryotic host cells such as strains of Escherichia
coli (e.g. strain DH5-α), of Bacillus subtilis, of Salmonella typhimurium, or strains of genera such as Pseudomonas,
Streptomyces and Staphylococcus.

[0123] Further suitable host cells usable according to the invention include yeast cells such as those of Saccharomyces,
typically Saccharomyces cerevisiae.
[0124] The invention also relates to a method for producing a SID® polypeptide as defined above, wherein said
method comprises the steps of:

a) cultivating a cell host which has been transformed with a SID® nucleic acid of the invention or with a vector 
containing a SID® nucleic acid in an appropriate culture medium; 

b) recovering the SID® recombinant polypeptide from the culture supernatant or from the cell lysate.

[0125] The SID® polypeptides or variants thereof thus recombinantly obtained may be purified, for example by high
performance liquid chromatography, such as reverse phase and/or cation exchange HPLC, as described by ROUGEOT
et al. (1994). The reason to prefer this kind of peptide or protein purification is the lack of by-products found in the
elution samples, which renders the resultant purified protein more suitable for a therapeutic use.

TWO-HYBRID METHODS OF THE INVENTION 

a) Yeast two-hybrid methods 

[0126] The invention also pertains to a yeast two-hybrid method for selecting a recombinant cell clone containing a
vector comprising a nucleic acid insert encoding a prey polypeptide which binds with a SID® polypeptide of SEQ ID
N°1 to 38 or a variant thereof, wherein said method comprises the steps of:

a) mating at least one first recombinant yeast cell clone of a collection of recombinant yeast cell clones transformed
with a plasmid containing the prey polynucleotide to be assayed with a second haploid recombinant Saccharomyces
cerevisiae cell clone transformed with a plasmid containing a bait polynucleotide encoding a SID® polypeptide of
the invention or a variant thereof;

b) cultivating diploid cells obtained in step a) on a selective medium; and
c) selecting recombinant cell clones which grow on said selective medium.

The yeast two-hybrid method above may further comprise the step of:
d) characterizing the prey polynucleotide contained in each recombinant cell clone selected in step c).
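
The logic of steps a) to d) can be mimicked with a toy in-silico model, given here only as a hedged illustration and not as a laboratory protocol. The function name, the clone identifiers and the `interacts` predicate are hypothetical; in a real screen, growth on the selective medium is the experimental read-out that the predicate stands in for.

```python
def yeast_two_hybrid_screen(prey_library, bait, interacts):
    """Toy model of a yeast two-hybrid screen.

    prey_library: dict mapping a clone identifier to its prey insert
    bait:         the bait (here, a label standing for a SID polypeptide)
    interacts:    callable (bait, prey) -> bool, standing in for reporter
                  gene activation and growth on the selective medium
    """
    # Step a): each prey clone is mated with the bait clone, giving diploids
    # that carry both the bait and one prey plasmid.
    diploids = {clone_id: (bait, prey) for clone_id, prey in prey_library.items()}

    # Steps b) and c): only diploids in which bait and prey interact grow on
    # the selective medium and are therefore selected.
    selected = [cid for cid, (b, p) in diploids.items() if interacts(b, p)]

    # Step d): the prey insert of each selected clone is characterized
    # (here, simply looked up; in practice, sequenced).
    return {cid: prey_library[cid] for cid in selected}


if __name__ == "__main__":
    # Hypothetical library and interaction rule, for illustration only.
    library = {"clone_1": "fragment_A", "clone_2": "fragment_B", "clone_3": "fragment_C"}
    binders = {"fragment_A", "fragment_C"}
    hits = yeast_two_hybrid_screen(library, "SID_bait", lambda b, p: p in binders)
    print(hits)
```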

[0127] Most preferably, such a yeast two-hybrid method may be performed by the one skilled in the art as
disclosed in example 2 hereafter.

[0128] According to the yeast two-hybrid method above, a SID® polypeptide of the invention or a variant thereof is 
used as a bait polypeptide. 

[0129] In a preferred embodiment of the yeast two-hybrid method described above, the prey polynucleotide is a DNA
fragment from the genome of a pathogenic strain of the hepatitis C virus (HCV) ranging from about 150 to about 600
nucleotides in length and which is inserted in a vector which is contained in one recombinant clone of a collection of
recombinant cell clones.

b) Bacterial two-hybrid method

[0130] A bacterial two-hybrid method of the invention may be performed by the one skilled in the art according to 
the teachings of KARIMOVA et al. (1998). 

[0131] The first step of selecting a collection of nucleic acids encoding polypeptides which bind specifically to the
bait polypeptide may also be carried out through a bacterial two-hybrid system.

[0132] According to such bacterial two-hybrid system, bacterial cell clones, preferably Escherichia coli cells, are
transformed with a plasmid containing a bait polynucleotide encoding a bait polypeptide.

[0133] Then, plasmids containing a DNA insert are provided by rescuing the plasmids obtained from the collection
of yeast clones containing the genomic DNA or cDNA library which are described in the previous section entitled "Yeast
two-hybrid system". For example, the plasmid rescue may be carried out according to the following steps:

(i) extracting plasmid DNA contained in the collection of yeast clones obtained as disclosed in the previous section,
by using a conventional DNA extraction buffer and a phenol:chloroform:isoamyl alcohol mixture (25:24:1) before centrifuging;

(ii) transferring a desired volume of the supernatant obtained at the end of step (i) to a sterile Eppendorf tube and
adding a precipitation buffer (ethanol/NH4Ac) before centrifuging and resuspending the pellet after washing in ethanol;

(iii) transforming, by electroporation, Escherichia coli cells (e.g. Escherichia coli cells of strain NC 1066) which have been
rendered electrocompetent with a desired volume (e.g. 1 µl) of the yeast plasmid DNA extract obtained at step (ii);

(iv) collecting the transformed Escherichia coli cells.

[0134] Alternatively, a collection of Escherichia coli cell clones containing a collection of HCV genomic DNA inserts
may be obtained by constructing the DNA library directly in the bacterial cell, such as disclosed in Flajolet et al. (2000).

[0135] Then, the bacterial recombinant cells which have been transformed both with a plasmid containing a bait
polynucleotide encoding a bait polypeptide and a plasmid containing a prey polynucleotide encoding a prey polypeptide
are cultivated on a selective medium.

[0136] Then, recombinant cell clones capable of growing on said selective medium are selected and the DNA inserts
of the plasmids contained therein are sequenced.

[0137] By bacterial two-hybrid system is generally intended a method that usually makes use of at least one reporter
gene, the transcription of which is activated when a prey polypeptide and a bait polypeptide produced by the recombinant
cell bind to each other, i.e. when the specific domain contained in the prey polypeptide and the complementary
domain contained in the bait polypeptide bind one to the other.

[0138] The invention further pertains to a bacterial two-hybrid method for identifying a recombinant cell clone containing
a prey polynucleotide encoding a prey polypeptide which binds with a SID® polypeptide of SEQ ID N°1 to 38
or a variant thereof, wherein said method comprises the steps of:

a) transforming bacterial cell clones with a plasmid containing a SID® polynucleotide encoding a SID® polypeptide
of the invention or a variant thereof;

b) rescuing prey plasmids containing prey polynucleotides wherein each prey polynucleotide is a DNA fragment
from the genome of a desired organism and wherein each prey plasmid is contained in one recombinant yeast cell
clone of a collection of recombinant yeast cell clones;

c) transforming the recombinant bacterial cell clones obtained in step a) with the plasmids rescued in step b);

d) cultivating bacterial recombinant cells obtained in step c) on a selective medium; and

e) selecting recombinant cell clones which grow on said selective medium.

[0139] The bacterial two-hybrid method described above may further comprise the step of f) characterizing the prey
polynucleotide contained in each recombinant cell clone selected at step e).

[0140] In one preferred embodiment of the yeast or bacterial two-hybrid methods described above, the prey polypeptide
is a human polypeptide expressed by a mammal which is infected by the Hepatitis C virus, such as humans and monkeys,
typically chimpanzees.

[0141] Generally, the yeast two-hybrid method or the bacterial two-hybrid method as disclosed herein may be performed
with prey polypeptides of any origin, either of viral, fungal, bacterial or mammal origin, i.e. either of prokaryotic
or eukaryotic origin.

[0142] In a second preferred embodiment of the two-hybrid methods above, the prey polypeptide is an HCV polypeptide.

[0143] Most preferably, the prey polypeptide is encoded by a strain of the hepatitis C virus which is pathogenic for
humans, such as strain H77.

SETS OF NUCLEIC ACIDS AND SETS OF POLYPEPTIDES OF THE INVENTION 

[0144] In yet another aspect, the present invention relates to a set of two nucleic acids consisting of:

i) a first nucleic acid encoding a SID® polypeptide of SEQ ID N°1 to 39 of the invention or a variant thereof; and

ii) a second nucleic acid encoding a prey polypeptide which binds specifically with a SID® polypeptide defined in i).

[0145] In still a further aspect, the invention is also directed to a set of two polypeptides consisting of:

i) a first polypeptide consisting of a SID® polypeptide of SEQ ID N°1 to 39 of the invention or a variant thereof; and

ii) a second polypeptide which binds specifically with the first polypeptide.

[0146] The invention further relates to a complex formed between:

i) a first polypeptide consisting of a SID® polypeptide of SEQ ID N°1 to N°38 of the invention; and

ii) a second polypeptide which binds specifically with the first polypeptide.

[0147] The invention also relates to a protein-protein interaction wherein the two interacting proteins consist of a set 
of two polypeptides as defined above. 

[0148] In a preferred embodiment, the invention relates to the protein-protein interactions wherein the sets of two 
polypeptides consist of a SID® polypeptide of SEQ ID N°1 to 38 and an HCV polypeptide. 

[0149] When several iterations of the two-hybrid method are performed, and common SID® polypeptides and
prey polypeptides are thus selected, a map of all the interactions between these polypeptides may be designed which takes
into account the known and/or suspected biological function of each of the interacting polypeptides.

[0150] Table 1 illustrates protein-protein interactions between the SID® polypeptides of SEQ ID N°1 to 38 and polypeptides
of SEQ ID N°77 to 113, which are encoded by the genome of strain H77 of the hepatitis C virus, which is pathogenic
for a mammal, such as a human or a chimpanzee.

[0151] Thus, the data presented in table 1 disclose particular sets of nucleic acids as well as particular sets of polypeptides
which are encompassed by the present invention.

[0152] For example, table 1 discloses that the nucleic acid of SEQ ID N°39 encodes the SID® polypeptide of SEQ
ID N°1, which contains exclusively (100 %) an amino acid sequence from the Core protein of HCV strain H77.

[0153] The nucleic acid of SEQ ID N°39 starts at the nucleotide in position 446 and ends at the nucleotide in position
600 of the HCV genome which is described by YANAGI et al. (1997).

[0154] Table 1 also discloses that the SID® polypeptide of SEQ ID N°1 is part of a set of polypeptides of the invention,
wherein the second polypeptide of said set consists of the polypeptide of SEQ ID N°77, which is encoded
by the nucleic acid sequence of SEQ ID N°114, 87% of which is derived
from the region of the H77 strain HCV DNA encoding the Core protein.

[0155] Thus, a particular set of polypeptides according to the invention consists of:

i) the polypeptide of SEQ ID N°1; and 

ii) the polypeptide of SEQ ID N°77. 


[0156] The same reasoning applies to every set of polypeptides disclosed in table 1, each of which is expressly part of the
present invention.

[0157] Similarly, a particular set of nucleic acids according to the invention consists of: 

(i) the nucleic acid of SEQ ID N°39; and

(ii) the nucleic acid of SEQ ID N°114. 

[0158] The same reasoning applies to every set of nucleic acids disclosed in table 1, each of which is expressly part of the
present invention.

[0159] Thus, particular sets of two polypeptides of the invention are respectively:
SEQ ID N°77/SEQ ID N°1; SEQ ID N°78/SEQ ID N°2; SEQ ID N°78/SEQ ID N°3; SEQ ID N°79/SEQ ID N°4;
SEQ ID N°80/SEQ ID N°5; SEQ ID N°81/SEQ ID N°6; SEQ ID N°82/SEQ ID N°7; SEQ ID N°83/SEQ ID N°8;
SEQ ID N°84/SEQ ID N°9; SEQ ID N°85/SEQ ID N°10; SEQ ID N°86/SEQ ID N°11; SEQ ID N°87/SEQ ID N°12;
SEQ ID N°88/SEQ ID N°13; SEQ ID N°89/SEQ ID N°14; SEQ ID N°90/SEQ ID N°15; SEQ ID N°91/SEQ ID N°16;
SEQ ID N°92/SEQ ID N°17; SEQ ID N°93/SEQ ID N°18; SEQ ID N°94/SEQ ID N°19; SEQ ID N°95/SEQ ID N°20;
SEQ ID N°96/SEQ ID N°21; SEQ ID N°97/SEQ ID N°22; SEQ ID N°98/SEQ ID N°23; SEQ ID N°99/SEQ ID N°24;
SEQ ID N°100/SEQ ID N°25; SEQ ID N°101/SEQ ID N°26; SEQ ID N°102/SEQ ID N°27; SEQ ID N°103/SEQ ID N°28;
SEQ ID N°104/SEQ ID N°29; SEQ ID N°105/SEQ ID N°30; SEQ ID N°106/SEQ ID N°31; SEQ ID N°107/SEQ ID N°32;
SEQ ID N°108/SEQ ID N°33; SEQ ID N°109/SEQ ID N°34; SEQ ID N°110/SEQ ID N°35; SEQ ID N°111/SEQ ID N°36;
SEQ ID N°112/SEQ ID N°37; and SEQ ID N°113/SEQ ID N°38.

[0160] Similarly, particular sets of two nucleic acids according to the invention are respectively:
SEQ ID N°114/SEQ ID N°39; SEQ ID N°115/SEQ ID N°40; SEQ ID N°115/SEQ ID N°41; SEQ ID N°116/SEQ ID N°42;
SEQ ID N°117/SEQ ID N°43; SEQ ID N°118/SEQ ID N°44; SEQ ID N°119/SEQ ID N°45; SEQ ID N°120/SEQ ID N°46;
SEQ ID N°121/SEQ ID N°47; SEQ ID N°122/SEQ ID N°48; SEQ ID N°123/SEQ ID N°49; SEQ ID N°124/SEQ ID N°50;
SEQ ID N°125/SEQ ID N°51; SEQ ID N°126/SEQ ID N°52; SEQ ID N°127/SEQ ID N°53; SEQ ID N°128/SEQ ID N°54;
SEQ ID N°129/SEQ ID N°55; SEQ ID N°130/SEQ ID N°56; SEQ ID N°131/SEQ ID N°57; SEQ ID N°132/SEQ ID N°58;
SEQ ID N°133/SEQ ID N°59; SEQ ID N°134/SEQ ID N°60; SEQ ID N°135/SEQ ID N°61; SEQ ID N°136/SEQ ID N°62;
SEQ ID N°137/SEQ ID N°63; SEQ ID N°138/SEQ ID N°64; SEQ ID N°139/SEQ ID N°65; SEQ ID N°140/SEQ ID N°66;
SEQ ID N°141/SEQ ID N°67; SEQ ID N°142/SEQ ID N°68; SEQ ID N°143/SEQ ID N°69; SEQ ID N°144/SEQ ID N°70;
SEQ ID N°145/SEQ ID N°71; SEQ ID N°146/SEQ ID N°72; SEQ ID N°147/SEQ ID N°73; SEQ ID N°148/SEQ ID N°74;
SEQ ID N°149/SEQ ID N°75 and SEQ ID N°150/SEQ ID N°76.
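
By way of illustration only, the pairings enumerated in paragraphs [0159] and [0160] follow a regular numbering pattern, which the following Python sketch makes explicit. The function names are hypothetical, and the numbering rule is read off the two lists above rather than from table 1 itself.

```python
# Illustrative sketch: regenerate the SEQ ID pairings listed in [0159] and [0160].
# Function names are hypothetical; the numbering rule is inferred from the lists above.
from typing import List, Tuple


def prey_polypeptide_id(sid_id: int) -> int:
    """SEQ ID of the polypeptide paired with SID polypeptide `sid_id` (1..38)."""
    if not 1 <= sid_id <= 38:
        raise ValueError("SID polypeptides are numbered SEQ ID 1 to 38")
    if sid_id == 1:
        return 77
    if sid_id in (2, 3):      # SEQ ID 78 is paired with both SEQ ID 2 and SEQ ID 3
        return 78
    return sid_id + 75        # SEQ ID 4 -> 79, ..., SEQ ID 38 -> 113


def polypeptide_pairs() -> List[Tuple[int, int]]:
    """(binding partner, SID polypeptide) pairs, in the order of [0159]."""
    return [(prey_polypeptide_id(n), n) for n in range(1, 39)]


def nucleic_acid_pairs() -> List[Tuple[int, int]]:
    """(partner nucleic acid, SID nucleic acid) pairs, in the order of [0160]."""
    # In the lists above, the SID nucleic acid number is the SID polypeptide
    # number plus 38, and the partner nucleic acid number is the partner
    # polypeptide number plus 37.
    return [(prey + 37, sid + 38) for prey, sid in polypeptide_pairs()]


if __name__ == "__main__":
    for prey, sid in polypeptide_pairs()[:4]:
        print(f"SEQ ID N°{prey}/SEQ ID N°{sid}")
```

Such a rule is only a convenience for checking the lists; table 1 remains the reference for the actual sets.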

[0161] The protein-protein interactions disclosed in table 1 allow the design of a map of interactions between various
polypeptides encoded by the genome of the H77 strain of HCV.

[0162] In such a Protein Interaction Map (PIM®), each SID® polypeptide is linked, for example by an arrow, to the bait
polypeptide to which it specifically binds.

[0163] Such a Protein Interaction Map (PIM®) may help the one skilled in the art to decipher a whole metabolic
and/or physiological pathway that is functionally active within a pathogenic strain of HCV. Protein Interaction Maps and
computer-readable versions of a PIM® are part of the present invention.

[0164] Therefore, in still another aspect, the present invention is directed to a computer-readable medium (such
as a floppy disk, a CD-ROM or any electronic or magnetic format which can be read by a computer) having stored thereon
protein-protein interactions according to the invention, preferably stored in the form of a Protein Interaction Map, as
shown, for example, in FROMONT-RACINE et al. (1997).

[0165] In a preferred embodiment, the invention comprises a computer-readable medium as defined above, wherein
the protein-protein interactions stored thereon are linked to an annotated database, for example through the Internet.

[0166] In another preferred embodiment, the invention comprises a data bank in which the protein-protein interactions
are stored, said data bank being available on a World Wide Web site.
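
By way of illustration only, a Protein Interaction Map of the kind described in paragraphs [0162] to [0166] could be held as a directed graph and written to a computer-readable form. The Python sketch below assumes JSON as the storage format and uses hypothetical function names; neither is prescribed by the present specification.

```python
# Illustrative sketch: a Protein Interaction Map held as a directed graph and
# serialised to JSON. The format and all names are assumptions for illustration.
import json
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def build_pim(pairs: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Map each SID polypeptide to the polypeptides it binds (one arrow per pair)."""
    pim = defaultdict(list)
    for sid, partner in pairs:
        if partner not in pim[sid]:      # record each arrow only once
            pim[sid].append(partner)
    return dict(pim)


def dump_pim(pim: Dict[str, List[str]]) -> str:
    """Serialise the map; sorted keys give a stable, comparable output."""
    return json.dumps(pim, sort_keys=True, indent=2)


def load_pim(text: str) -> Dict[str, List[str]]:
    """Read a map back from its JSON form."""
    return json.loads(text)


if __name__ == "__main__":
    # Three pairs taken from paragraph [0159]; the strings are plain labels.
    pairs = [
        ("SEQ ID 1", "SEQ ID 77"),
        ("SEQ ID 2", "SEQ ID 78"),
        ("SEQ ID 3", "SEQ ID 78"),
    ]
    pim = build_pim(pairs)
    assert load_pim(dump_pim(pim)) == pim   # the round trip preserves the map
    print(dump_pim(pim))
```

Links to an annotated database, as contemplated in [0165], could be added as further fields on each entry; they are omitted here for brevity.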

METHODS FOR SELECTING INHIBITORS OF PROTEIN-PROTEIN INTERACTIONS OF THE INVENTION 


[0167] The transformed host cells as described above can also be used as models so as to study the interactions 
between a SID® polypeptide of the invention and its binding partner polypeptide, or between a SID® polypeptide of 
the invention and chemical or protein compounds which inhibit the binding between said SID® polypeptide and its 
binding partner polypeptide. 

[0168] Examples of a SID® polypeptide and its binding partner polypeptide are typically the sets of polypeptides of
the invention which are described above.

[0169] In particular, the transformed host cells of the invention may be used for the selection of molecules which
interact with a SID® polypeptide as described herein, as a cofactor or as an inhibitor, in particular a competitive inhibitor,
or alternatively which have an agonist or antagonist activity on the protein-protein interaction in which said SID® polypeptide
is involved. Preferably, said transformed host cells will be used as a model allowing, in particular, the selection of
products which make it possible to prevent and/or to treat pathologies induced by the hepatitis C virus.

[0170] Consequently, the invention also consists of a method for selecting a molecule which inhibits the protein-protein
interaction of a set of two polypeptides as defined above, wherein said method comprises the steps of:

a) cultivating a recombinant host cell containing a reporter gene the expression of which is toxic for said recombinant
host cell, said host cell being transformed with two vectors wherein:

i) the first vector contains a nucleic acid comprising a polynucleotide encoding a first hybrid polypeptide
containing one of said two polypeptides and a DNA binding domain;

ii) the second vector contains a nucleic acid comprising a polynucleotide encoding a second hybrid polypeptide
containing the second of said two polypeptides and an activating domain capable of activating said toxic
reporter gene when the first and the second hybrid polypeptides are interacting;

on a selective medium containing the molecule to be tested and allowing the growth of said recombinant host
cell when the toxic reporter gene is not activated; and

b) selecting the molecule which inhibits the growth of the recombinant host cell defined in step a).

[0171] The invention is also directed to a method for selecting a molecule which inhibits the protein-protein interaction
of a set of two polypeptides as defined above, wherein said method comprises the steps of:

a) cultivating a recombinant host cell containing a reporter gene the expression of which is toxic for said recombinant
host cell, said host cell being transformed with two vectors wherein:

i) the first vector contains a nucleic acid comprising a polynucleotide encoding a first hybrid polypeptide
containing one of said two polypeptides and the first domain of an enzyme;

ii) the second vector contains a nucleic acid comprising a polynucleotide encoding a second hybrid polypeptide
containing the second of said two polypeptides and the second part of said enzyme capable of activating said
toxic reporter gene when the first and the second hybrid polypeptides are interacting, said interaction
restoring the catalytic activity of the enzyme;

on a selective medium containing the molecule to be tested and allowing the growth of said recombinant host
cell when the toxic gene is not activated; and

b) selecting the molecule which inhibits the growth of the recombinant host cell defined in step a).


[0172] In a preferred embodiment, said toxic reporter gene that can be used for negative selection is the URA3, CYH1
or CYH2 gene.

[0173] For example, a method for the screening of a molecule which inhibits the interaction between a SID® polypeptide
of the invention and its binding protein counterpart may comprise the following steps:

- transform a permeabilized yeast cell with two vectors, respectively a first vector containing a SID® nucleic acid of
the invention and a second vector containing a prey nucleic acid as defined in the present specification;
- plate the transformed permeabilized yeast cells above in top agar on square boxes;
- apply the candidate inhibitor molecules to be tested by spotting onto the top agar as soon as it has solidified;
- incubate, for example, overnight at 30°C; and
- select the inhibitor compounds that allow the growth of the transformed yeast cells.
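
By way of illustration only, the read-out logic of the screening steps above can be sketched in Python. The data model (one record per spotted compound, with an observed growth flag) and all names are hypothetical; the sketch merely restates that the toxic reporter is expressed while the two hybrid polypeptides interact, so that growth around a spot points to a candidate inhibitor.

```python
# Illustrative sketch of the read-out logic of the negative-selection screen.
# The data model and names are hypothetical, not taken from the specification.
from dataclasses import dataclass
from typing import List


@dataclass
class Spot:
    compound: str   # candidate molecule spotted on the top agar
    growth: bool    # True if yeast growth is observed around the spot


def reporter_active(interaction_intact: bool) -> bool:
    """The toxic reporter is expressed only while the two hybrid polypeptides interact."""
    return interaction_intact


def cells_grow(interaction_intact: bool) -> bool:
    """On the selective medium, cells grow only when the toxic reporter is not activated."""
    return not reporter_active(interaction_intact)


def candidate_inhibitors(spots: List[Spot]) -> List[str]:
    """Compounds whose spots show growth, i.e. where the interaction appears disrupted."""
    return [spot.compound for spot in spots if spot.growth]


if __name__ == "__main__":
    plate = [Spot("compound A", False), Spot("compound B", True)]
    print(candidate_inhibitors(plate))
```

In practice, growth may have causes other than disruption of the interaction, so any compound retained in this way would remain a candidate to be confirmed by suitable controls.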

[0174] The invention also provides for a kit for the screening of a molecule which inhibits the protein-protein interaction
of a set of two polypeptides as defined above, wherein said kit comprises a recombinant host cell containing a reporter
gene the expression of which is toxic for said recombinant host cell, said host cell being transformed with two vectors
wherein:

i) the first vector contains a nucleic acid comprising a polynucleotide encoding a first hybrid polypeptide containing
one of said two polypeptides and a DNA binding domain;

ii) the second vector contains a nucleic acid comprising a polynucleotide encoding a second hybrid polypeptide
containing the second of said two polypeptides and an activating domain capable of activating said toxic reporter
gene when the first and the second hybrid polypeptides are interacting.

[0175] Another object of the invention consists of a kit for the screening of a molecule which inhibits the protein-protein
interaction of a set of two polypeptides as defined above, wherein said kit comprises a recombinant host cell
containing a reporter gene the expression of which is toxic for said recombinant host cell, said host cell being transformed
with two plasmids wherein:

i) the first vector contains a nucleic acid comprising a polynucleotide encoding a first hybrid polypeptide containing
one of said two polypeptides and the first domain of a protein;

ii) the second vector contains a nucleic acid comprising a polynucleotide encoding a second hybrid polypeptide
containing the second of said two polypeptides and the second part of said protein capable of activating said toxic
reporter gene when the first and the second hybrid polypeptides are interacting, said interaction restoring the
activity of the protein.

In the selection methods above, the transcription or activating domain and the DNA-binding
domain may be derived from Gal4 and LexA respectively.

[0176] In the embodiment wherein the first domain is a first part of an enzyme and a complementary domain is a
second part of the same enzyme, and wherein the proximity of the two parts of the enzyme restores the enzyme activity
and activates a reporter gene, the two parts of the enzyme are most preferably the T25 and T18 polypeptides that
form the catalytic domain of the Bordetella pertussis adenylate cyclase.

[0177] As an illustrative embodiment, the reporter gene is chosen from the group consisting of a nutritional gene
and a gene the expression of which is visualised by colorimetry, such as His3, LacZ or both LacZ and His3.


MARKER COMPOUNDS OF THE INVENTION 

[0178] The Selected Interacting Domain (SID®) polypeptides of SEQ ID N°1 to 38 of the invention and variants
thereof defined in the present specification, which bind specifically to a polypeptide of interest (e.g. a bait polypeptide),
are useful as reagents for specifically detecting, labelling, targeting or purifying a polypeptide of interest, typically
a polypeptide encoded by HCV, within a sample, since the SID® polypeptides possess properties that have never been
achieved using conventional detection compounds, such as an antibody or an antibody fragment.

[0179] Firstly, the SID® polypeptides of the invention possess a high specificity of binding to the polypeptide of
interest, since a SID® polypeptide consists of a portion of a larger polypeptide which binds in a highly specific manner
to the polypeptide of interest in the natural environment within the eukaryotic cell infected by the hepatitis C virus.

[0180] Secondly, the SID® polypeptides generally have a low molecular weight, generally from 3 kDa, and are thus
easy to produce, on the one hand, and, on the other hand, can be easily introduced into a cell when the detection
of the localisation or of the expression of the polypeptide of interest is sought. Moreover, the small size of a SID®
polypeptide allows its passage through inner cell barriers such as the nuclear membrane, or the membranes surrounding
the different cell organelles.

[0181] Thus, a first object of the invention consists of a marker compound wherein said compound comprises:

a) a Selected Interacting Domain (SID®) polypeptide of the invention or a variant thereof that binds specifically to
the polypeptide of interest; and
b) a detectable molecule bound thereto.

[0182] Such a marker compound is primarily useful for detecting, labelling or targeting a polypeptide of interest, for 
example a polypeptide of interest contained in a sample. 

[0183] A detectable molecule according to the invention comprises, or alternatively consists of, any molecule which
produces or can be induced to produce a signal. The detectable molecule can be a member of the signal producing
system that includes the signal producing means.

[0184] The detectable molecule may be isotopic or non-isotopic. By way of example and not limitation, the detectable
molecule can be part of a catalytic reaction system, such as enzymes, enzyme fragments, enzyme substrates, enzyme
inhibitors, co-enzymes or catalysts; part of a chromogen system, such as fluorophores, dyes, chemiluminescers,
luminescers or sensitizers; a dispersible particle that can be nonmagnetic or magnetic; a solid support; a liposome; a
ligand; a receptor; a hapten; a radioactive isotope; and so forth.

[0185] It must be generally understood that all the embodiments disclosed in the present specification involving
a Selected Interacting Domain (SID®) polypeptide apply equally to any variant thereof.

Fluorescent detectable molecules

[0186] In one aspect of the marker compound according to the invention, the detectable molecule consists of a
fluorescent molecule. Fluorescent moieties which are frequently used as labels are for example those described by
Ichinose et al. (1991). Other fluorescent detectable molecules are fluorescein isothiocyanate (FITC) such as described
by Shattil et al. (1987) or by Goding et al. (1986). The fluorescent detectable molecule may also comprise a phycoerythrin
as taught by Goding et al. (1986), and Shattil et al. (1985). Other examples of fluorescent detectable molecules
suitable for use as labels of a marker compound according to the invention are rhodamine isothiocyanate, dansyl
chloride and XRITC.

[0187] Another fluorescent detectable molecule consists of the green fluorescent protein (GFP) of the jellyfish Aequorea
victoria, and its numerous fluorescent protein derivatives.

[0188] The one skilled in the art may advantageously refer to the articles of CHALFIE et al. (1994) and of HEIM et
al. (1994), which disclose the uses of GFP for the study of gene expression and protein localisation. The one skilled
in the art may also refer to the article of Rizzuto et al. (1995), which discusses the use of wild-type GFP as a tool for
visualising subcellular organelles in cells, to the article of KAETHER and GERDES (1995), which reports the visualisation
of protein transport along the secretory pathway using wild-type GFP, to the article of HU and CHENG (1995),
which relates to the expression of GFP in plant cells, and also to the article of Davis et al. (1995), which discloses
GFP expression in Drosophila embryos. For the use of several fluorescent variants of GFP, the one skilled in the art
may refer to the article of Delagrave et al. (1995), as well as to the article of Heim et al. (1995). DNA encoding GFP is
available commercially, for example from Clontech in Palo Alto, California, USA. The one skilled in the art may also use
humanized GFP genes such as those described in US Patent N°6,020,192 and also the GFP protein disclosed
in US Patent N°5,941,084.

[0189] Another fluorescent protein that may be used in a marker compound according to the invention consists of
the yellow fluorescent protein (YFP).

[0190] A further suitable luminescent protein consists of the luciferase protein. 

Detectable molecules exhibiting a catalytic activity 

[0191] In another embodiment of a detectable molecule included in a marker compound according to the invention,
said detectable molecule is endowed with a catalytic activity and may thus consist of an enzyme or a catalytically active
enzyme fragment. Some enzymatic labels are described in US Patent N°3,654,090. Such enzymes may be, for example,
horseradish peroxidase (HRP), alkaline phosphatase or glutathione peroxidase, which are well known to
the one skilled in the art.

[0192] Enzymes, enzyme fragments, enzyme inhibitors, enzyme substrates, and other components of enzyme reaction
systems can be used as detectable molecules. Where any of these components is used as a detectable molecule,
a chemical reaction involving one of the components is part of the signal producing system.

[0193] Coupled catalysts can also involve an enzyme with a non-enzymatic catalyst. The enzyme can produce a
reactant, which undergoes a reaction catalysed by the non-enzymatic catalyst, or the non-enzymatic catalyst may produce
a substrate (including co-enzymes) for the enzyme. The one skilled in the art may advantageously refer to
US Patent N°4,160,645, which discloses a wide variety of non-enzymatic catalysts which may be employed, the appropriate
portions of which are incorporated herein by reference.

[0194] The enzyme or co-enzyme employed provides the desired amplification by producing a product which absorbs
light, e.g., a dye, or emits light upon irradiation, e.g., a fluorescer. Alternatively, the catalytic reaction can lead to direct
light emission, e.g., chemiluminescence. A large number of enzymes and co-enzymes for providing such products are
described in US Patents N°4,275,149, columns 19 to 23, and N°4,318,980, columns 10 to 14, which disclosures
are incorporated herein by reference.

[0195] A number of enzyme combinations are set forth in US Patent N°4,275,149, columns 23 to 28, which disclosures
are incorporated herein by reference.

[0196] When a single enzyme is used as the detectable molecule, or alternatively as comprised in the detectable
molecule, the enzymes that may find use are hydrolases, transferases, lyases, isomerases, ligases or synthetases, and
oxidoreductases.

[0197] Alternatively, luciferases may be used, such as firefly luciferase and bacterial luciferase.

[0198] Primarily, the enzymes of choice, based on the I.U.B. classification, are: (i) class 1, oxidoreductases, and (ii)
class 3, hydrolases. Most preferred oxidoreductases are (i) dehydrogenases of class 1.1, more particularly 1.1.1,
1.1.3 and 1.1.99, and (ii) peroxidases in class 1.11. Of the hydrolases, class 3.1, more particularly 3.1.3,
and class 3.2, more particularly 3.2.1, are preferred.
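
By way of illustration only, the class preferences of paragraph [0198] can be expressed as a prefix test on an enzyme classification number. The Python sketch below uses hypothetical names and retains only the most particular classes named above (1.1.1, 1.1.3, 1.1.99, 1.11, 3.1.3 and 3.2.1).

```python
# Illustrative sketch: test whether an enzyme classification number falls within
# one of the classes named as preferred in paragraph [0198]. Names are hypothetical.
from typing import Tuple

PREFERRED_PREFIXES: Tuple[Tuple[int, ...], ...] = (
    (1, 1, 1), (1, 1, 3), (1, 1, 99),   # dehydrogenases
    (1, 11),                            # peroxidases
    (3, 1, 3),                          # hydrolases, class 3.1.3
    (3, 2, 1),                          # hydrolases, class 3.2.1
)


def parse_ec(ec: str) -> Tuple[int, ...]:
    """Turn a string such as '1.11.1.7' into the tuple (1, 11, 1, 7)."""
    return tuple(int(part) for part in ec.strip().rstrip(".").split("."))


def is_preferred(ec: str) -> bool:
    """True if the number starts with one of the preferred class prefixes."""
    number = parse_ec(ec)
    return any(number[: len(prefix)] == prefix for prefix in PREFERRED_PREFIXES)


if __name__ == "__main__":
    for ec in ("1.11.1.7", "3.1.3.1", "2.7.1.1"):
        print(ec, is_preferred(ec))
```

For instance, horseradish peroxidase (EC 1.11.1.7) and alkaline phosphatase (EC 3.1.3.1) fall within the preferred classes under this test, whereas a transferase such as EC 2.7.1.1 does not.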

[0199] Illustrative dehydrogenases include malate dehydrogenase, glucose-6-phosphate dehydrogenase and lactate
dehydrogenase. Of the oxidases, glucose oxidase is exemplary. Of the peroxidases, horseradish peroxidase is
illustrative. Of the hydrolases, alkaline phosphatases, β-glucosidase and lysozyme are illustrative.

Chemiluminescent detectable molecules 

[0200] The detectable molecule comprised within the marker compound according to the invention may also consist
of a chemiluminescent moiety. The chemiluminescent source involves a compound which becomes electronically excited
by a chemical reaction and may emit light which serves as the detectable signal or donate energy to a fluorescent
acceptor.

[0201] A diverse number of families of compounds have been found to provide chemiluminescence under a variety of
conditions. One family of compounds is 2,3-dihydro-1,4-phthalazinedione. The most utilised compound is luminol,
which is the 5-amino analogue of the compound above. Other members of the family include the 5-amino-6,7,8-trimethoxy-
and the dimethylamino-[ca]benzo analogue. These compounds can be made to luminesce with alkaline hydrogen
peroxide or calcium hypochlorite and base.

[0202] Another family of compounds is the 2,4,5-triphenylimidazoles, with lophine as the common name for the parent
product. Chemiluminescent analogues include para-dimethylamino- and para-methoxy-substituents. Chemiluminescence
may also be obtained with acridinium esters, dioxetanes and oxalates, usually oxalyl active esters, e.g., p-nitrophenyl,
and a peroxide, e.g., hydrogen peroxide, under basic conditions. Alternatively, luciferins may be used in conjunction
with luciferase or lucigenins.


Radioactive detectable molecules 

[0203] In a further embodiment of a detectable molecule comprised in a marker compound according to the invention,
said detectable molecule is radioactively labelled, such as with [3H], [32P] and [125I].


Colloidal metal detectable molecules 

[0204] In still a further embodiment, the detectable molecule comprised in a marker compound according to the
invention may include a colloidal metal particle. Colloidal metals have been employed in immunoassays previously.
Mostly, they consisted of either colloidal iron or gold. The one skilled in the art may advantageously refer to the articles
of Horisberger (1981) and Martin et al. (1990). In other cases, the metals are chosen for their colour, i.e., their presence
is determined by their colour or electron density under an electron microscope. Both the colour and electron density
are directly proportional to the mass of the metal colloid.

STRUCTURE OF THE MARKER COMPOUNDS OF THE INVENTION

[0205] In a first preferred embodiment of a marker compound of the invention, the detectable molecule is covalently 
bound to the Selected Interacting Domain (SID®) polypeptide of SEQ ID N°1 to SEQ ID N°38 or a variant thereof. 
[0206] According to this specific embodiment, detectable molecules comprising fluorescent proteins such as GFP
and YFP, enzymes or enzyme fragments such as alkaline phosphatase, glutathione peroxidase and horseradish
peroxidase, chemiluminescent molecules, radioactive labels or colloidal metal particles will be preferred.
[0207] General methods that may be used by the one skilled in the art for covalently binding the detectable molecules 
to the Selected Interacting Domain (SID®) polypeptide are described in the numerous bibliographic references related 
to the preparation of the antibody conjugates used for carrying out immunoassays. 

[0208] In a second preferred embodiment of a marker compound according to the invention, the detectable molecule
is non-covalently bound to the Selected Interacting Domain (SID®) polypeptide or a variant thereof.
[0209] In a first preferred aspect of this second preferred embodiment, the detectable molecule consists of an antibody
directed specifically against the Selected Interacting Domain (SID®) polypeptide or a variant thereof.
[0210] The antibodies directed specifically against the Selected Interacting Domain (SID®) polypeptide or a variant
thereof may be either radioactively or non-radioactively labelled.

NUCLEIC ACIDS ENCODING A MARKER COMPOUND OF THE INVENTION. 

[0211] The present invention also relates to a nucleic acid encoding a marker compound as defined above. 
[0212] Most preferred nucleic acids encompassed by the invention include polynucleotides that encode a marker
compound wherein the Selected Interacting Domain (SID®) polypeptide of SEQ ID N°1 to 38 or a variant thereof is 
covalently bound to the detectable molecule and wherein the detectable molecule consists itself of a polypeptide. 
[0213] Most preferred nucleic acids are those of SEQ ID N°39 to 76. 

[0214] In a first preferred embodiment of a nucleic acid according to the invention, said nucleic acid encodes a
Selected Interacting Domain (SID®) polypeptide which is fused to a fluorescent protein, such as GFP and YFP.

[0215] In a second preferred embodiment of a nucleic acid according to the invention, said nucleic acid encodes
a Selected Interacting Domain (SID®) polypeptide which is fused to a polypeptide endowed with a catalytic activity,
such as an enzyme or an enzymatically active enzyme fragment, like alkaline phosphatase, glutathione peroxidase
and horseradish peroxidase.

[0216] In a preferred embodiment, a nucleic acid encoding a marker compound of the invention comprises a DNA
coding sequence which is transcribed and translated into said marker compound in a cell in vitro or in vivo when placed
under the control of appropriate regulatory sequences. The boundaries of the coding sequence are determined by a
start codon and a translation stop codon. A coding sequence can include, but is not limited to:

- prokaryotic sequences, for example when the Selected Interacting Domain (SID®) nucleic acid and the nucleic
acid fused thereto which encodes the detectable molecule are of prokaryotic origin;

- prokaryotic and eukaryotic sequences, for example when the nucleic acid encoding the detectable molecule originates
from a eukaryotic host organism.
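
By way of illustration only, the statement in paragraph [0216] that the boundaries of a coding sequence are determined by a start codon and a translation stop codon can be sketched in Python. The sketch assumes the ATG start codon and the TAA, TAG and TGA stop codons of the standard genetic code; the function name and the example sequence are hypothetical.

```python
# Illustrative sketch: locate a coding sequence bounded by a start codon and an
# in-frame translation stop codon. Assumes the standard ATG / TAA, TAG, TGA codons;
# the function name is hypothetical.
from typing import Optional, Tuple

STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_coding_sequence(dna: str) -> Optional[Tuple[int, int]]:
    """Return (start, end) of the first ATG...stop region, end exclusive, or None."""
    dna = dna.upper()
    start = dna.find("ATG")
    while start != -1:
        # Walk codon by codon in the reading frame set by this start codon.
        for pos in range(start + 3, len(dna) - 2, 3):
            if dna[pos:pos + 3] in STOP_CODONS:
                return start, pos + 3
        start = dna.find("ATG", start + 1)   # no in-frame stop: try the next ATG
    return None


if __name__ == "__main__":
    example = "ccATGGCTTAAgg"                # made-up sequence for illustration
    bounds = find_coding_sequence(example)
    if bounds is not None:
        print(example[bounds[0]:bounds[1]].upper())
```

A fusion between a SID® nucleic acid and a nucleic acid encoding a detectable polypeptide would be expected to appear as a single such region, with no stop codon between the two fused parts.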


[0217] If the coding sequence is intended for expression in a eukaryotic cell, a polyadenylation signal and transcription
termination sequence will usually be located 3' to the coding sequence.

[0218] In a most preferred embodiment of a nucleic acid sequence according to the invention, said nucleic acid
sequence includes a regulatory region which is functional in the host organism within which the expression of said
nucleic acid sequence is sought, wherein said regulatory region comprises a promoter sequence.

[0219] "Regulatory region" means a nucleic acid sequence which regulates the expression of a nucleic acid. A regulatory
region may include sequences which are naturally responsible for expressing a particular nucleic acid (a homologous
region), or may include sequences of a different origin (responsible for expressing different proteins or even
synthetic proteins). In particular, the sequences can be sequences of eukaryotic or viral genes or derived sequences
which stimulate or repress transcription of a gene in a specific or non-specific manner and in an inducible or
non-inducible manner. Regulatory regions include origins of replication, RNA splice sites, enhancers, transcriptional
termination sequences, signal sequences which direct the polypeptide into the secretory pathways of the target cell, and
promoters.

[0220] A "promoter sequence" is a DNA regulatory region capable of binding RNA polymerase in a cell and initiating
transcription of a downstream (3' direction) coding sequence. For purposes of defining the present invention, the promoter
sequence is bounded at its 3' terminus by the transcription initiation site and extends upstream (5' direction) to
include the minimum number of bases or elements necessary to initiate transcription at levels detectable above background.
Within the promoter sequence will be found a transcription initiation site (conveniently defined, for example, by
mapping with nuclease S1), as well as protein binding domains (consensus sequences) responsible for the binding of
RNA polymerase.

[0221] A coding sequence is "under the control" of transcriptional and translational control sequences in a cell when 
RNA polymerase transcribes the coding sequence into mRNA, which is then trans-RNA spliced and translated into the 
protein encoded by the coding sequence. 

Most preferred vectors for the expression of a marker compound of the invention.

[0222] Most preferred recombinant vectors for expressing a marker compound of the invention include pASAA (figure
2), pACTIIst (figure 3), pT18 (figure 4), pUT18C (figure 5), pT25 (figure 6), pKT25 (figure 7), pB5 (figure 12) and pP6
(figure 13) containing inserted therein a nucleic acid encoding a Selected Interacting Domain (SID®) polypeptide as
defined above or a variant thereof.

[0223] The invention also pertains to recombinant host cells transformed with a vector expressing a marker compound 
as defined above, more particularly a vector comprising inserted therein a nucleic acid encoding said marker compound, 
which is operably linked to suitable regulation signals which are functional in the host cell wherein its expression is 
sought. 

[0224] Preferred cells for expression purposes will be selected according to the objective which is sought. For example,
in the embodiment wherein the production of a marker compound according to the invention in large quantities
is sought, the nature of the host cell used for its production is relatively unimportant, provided that large amounts of
Selected Interacting Domain (SID®) polypeptides or marker compounds of the invention are produced and that optional
further purification steps may be carried out easily.

[0225] However, in the embodiment wherein the marker compound is recombinantly produced within a host organism
for the purpose of qualitative or quantitative analysis of the polypeptide of interest to which said marker compound
specifically binds, the host organism is selected among the host organisms which are suspected of naturally
producing said polypeptide of interest.

[0226] Consequently, mammalian and human cells, as well as bacterial, yeast, fungal, insect, nematode and plant
cells, are cell hosts encompassed by the invention, and may be transfected either with a nucleic acid or with a recombinant
vector as defined above.

DETECTION METHODS OF THE INVENTION 

[0227] The present invention further relates to the use, for detection purposes, of a Selected Interacting Domain (SID®)
polypeptide of SEQ ID N°1 to 38 or a variant thereof, as well as of a nucleic acid encoding it, such as the nucleic acids
of SEQ ID N°39 to 76. It is recalled that a Selected Interacting Domain (SID®) polypeptide is defined by
its ability to bind in a highly specific manner to a given (e.g. bait) polypeptide of interest,
since the amino acid sequence of a SID® polypeptide is encoded by a nucleic acid, the nucleotide sequence of which
consists of the polynucleotide sequence which is common to a collection of nucleic acid sequences encoding prey
polypeptides that have been selected for their specific binding properties to a (bait) polypeptide of interest, as
explained above in the section entitled "SELECTED INTERACTING DOMAIN (SID®) POLYPEPTIDES".

[0228] The specific properties of a Selected Interacting Domain (SID®) polypeptide for binding to a given polypeptide
of interest, whether a viral, yeast, fungal, bacterial, insect, plant or mammalian polypeptide, including a polypeptide of human
origin, allow its use as a specific ligand for said polypeptide of interest, the detection of which is sought.
[0229] Therefore, the use of a Selected Interacting Domain (SID®) polypeptide in any detection method known in the art
which makes use of the ability of a detection ligand to bind specifically to a molecule of interest, most preferably a
polypeptide of interest, falls within the scope of the present invention.

[0230] Detection methods that make use of the recognition of a molecule of interest, most preferably a polypeptide
of interest, by a detection ligand are well known in the art and are primarily illustrated by the abundant literature that
relates to immunoassays, which is incorporated herein by reference in its entirety.

[0231] One skilled in the art may particularly refer to the book of Maggio (1980) (Heterogeneous assays),
US Patent N°3,817,837 (Homogeneous immunoassays), US Patent N°3,993,345 (Immunofluorescence methods),
US Patent N°4,233,402 (Enzyme channelling techniques), US Patent N°3,817,837 (Enzyme multiplied immunoassay
technique), US Patent N°4,366,241 and European Patent Application N°EP-A 0 143 574 (Migration type assays), US
Patent N°5,202,006, US Patent N°5,120,413 and US Patent N°5,145,567 (Immunofixation electrophoresis,
Immunoelectrophoresis), the article of Aguzzi et al. (1977), the article of White et al. (1986), the article of Merlini et al. (1983),
US Patent N°5,228,960 (Immunosubtraction electrophoresis), the articles of Chen et al. (1991) and Nielsen et al.
(1991), and US Patent N°5,120,413 (Capillary electrophoresis).

Acellular detection method of the invention. 

[0232] A first detection method of the invention consists of a method for detecting a polypeptide of interest within a
sample, wherein said method comprises the steps of:

a) contacting a marker compound or a plurality of marker compounds according to the invention with the sample
which is suspected to contain the polypeptide of interest the detection of which is sought;

b) detecting the complexes formed between said marker compound or said plurality of marker compounds and
said polypeptide of interest.

[0233] The sample which is assayed for the presence of the polypeptide of interest the detection of which is sought
may be of any nature, including any sample that may be used for carrying out an immunoassay.
[0234] In a first aspect, the sample may be any biological fluid, such as blood or blood separation products (e.g.
serum, plasma, buffy coat), urine, saliva or tears.

[0235] In a second aspect, the sample may be any isolated biological tissue sample, including tissue sections pre- 
viously fixed for purposes of histological studies. 

[0236] In a third aspect, the sample may be a culture supernatant of a cell culture or a cell lysate of cultured cells.
[0237] In a first preferred embodiment of the first detection method of the invention described above, the detection
step b) consists of measuring the fluorescence signal intrinsically emitted by the detectable molecule. For
example, advantage may be taken of SID® polypeptides or variants thereof having one
or several tryptophan residues in their amino acid sequence.

[0238] In a second preferred embodiment of the first detection method of the invention detailed above, the detection
step b) consists of submitting the detectable molecule to a source of energy at the excitation wavelength of said
detectable molecule, and measuring the light emitted at the emission wavelength of said detectable molecule.
[0239] An illustrative example of this second embodiment above is when the marker compound used consists of a
Selected Interacting Domain (SID®) which is bound to a fluorescent molecule, such as the fluorescent proteins GFP
or YFP.

[0240] For example, in the embodiment wherein the detectable molecule of the marker compound of the invention
which is used according to the first detection method above comprises, or alternatively consists of, a GFP protein, the
detection step b) includes illuminating the sample tested at an excitation wavelength substantially equal to 490 nm, and
measuring the light emitted by the marker compound which is bound to the polypeptide of interest within the sample
at an emission wavelength substantially equal to 510 nm.
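
As a minimal sketch of the read-out described in this paragraph, assuming a fluorimeter that returns raw intensities in arbitrary units, the following Python snippet background-corrects a reading taken at 510 nm under 490 nm excitation and flags the sample as positive above a cut-off. The names `GFP_EXCITATION_NM`, `GFP_EMISSION_NM` and `is_detected`, the blank readings and the cut-off of three standard deviations are illustrative assumptions and are not part of the method described.

```python
from statistics import mean, stdev

# Wavelengths quoted in the text for a GFP-based marker compound.
GFP_EXCITATION_NM = 490
GFP_EMISSION_NM = 510


def is_detected(sample_signal, blank_signals, n_sd=3.0):
    """Return (net_signal, detected) for one emission reading.

    blank_signals: readings from samples lacking the polypeptide of interest.
    n_sd: hypothetical cut-off, expressed in blank standard deviations.
    """
    background = mean(blank_signals)
    threshold = n_sd * stdev(blank_signals)
    net = sample_signal - background
    return net, net > threshold


# Hypothetical readings in arbitrary fluorescence units.
blanks = [100.0, 102.0, 98.0]
net, detected = is_detected(150.0, blanks)
```

Removing unbound marker compound before the measurement keeps the blank readings low, so that the cut-off reflects instrument noise and not residual free label.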

[0241] Preferably, the marker compounds which are not bound to the polypeptide of interest the detection of which
is sought within the sample are removed before carrying out the detection step.

[0242] In a third preferred embodiment, the detection step b) of the first detection method of the invention consists
of measuring the catalytic activity of the detectable molecule. In this specific embodiment, the marker compound used
in the detection method comprises a detectable molecule which comprises, or alternatively which consists of, an
enzyme or a catalytically active enzyme fragment, as already detailed in the section entitled "Marker compounds
of the invention".

[0243] In a fourth preferred embodiment, the detection step b) consists of measuring the radioactivity emitted by the
detectable molecule.

[0244] The present invention further relates to a kit for detecting a polypeptide of interest within a sample, wherein 
said kit comprises a marker compound according to the invention. 

[0245] Optionally, said detection kit further comprises the reagents necessary for carrying out the detection step b),
such as a suitable substrate for the particular enzyme or catalytically active enzyme fragment used, as well as suitable
buffer solutions, which may be identical to those conventionally used for performing immunoassays.

Cellular detection assay using a recombinantly produced marker compound of the invention. 

[0246] As already described above, any marker compound according to the invention may be produced by
genetic engineering techniques. In particular, a nucleic acid encoding a particular marker compound which binds
specifically to a polypeptide of interest the detection of which is sought may be inserted in a vector, wherein said vector
may be used to transfect or transform a host organism, either a prokaryotic or a eukaryotic cell host as defined
above.

[0247] In this specific embodiment, the recombinant marker compound of the invention is produced
within such a transfected or transformed host cell. Once the host cell of interest is transfected or transformed with such
a recombinant vector and once the recombinant marker compound is produced within the cell host of interest,
the Selected Interacting Domain (SID®) polypeptide portion of said marker compound will be able to bind specifically
to its specific target polypeptide within the cell host. In this situation, the recombinantly produced marker compound
of the invention will predominantly be localised at cell sites wherein the targeted polypeptide of interest is present.
[0248] This is the purpose of the second detection method of the invention which is detailed below.

[0249] A further object of the invention consists of a method for detecting a polypeptide of interest within a prokaryotic
or a eukaryotic cell host, wherein said method comprises the steps of:

a) providing a cell host to be assayed;

b) transfecting said cell host with a nucleic acid encoding a marker compound of the invention, or with a recombinant
vector encoding a marker compound of the invention;

c) detecting the complexes formed between the marker compound expressed by the transfected cell host and the
polypeptide of interest.

[0250] Because the Selected Interacting Domain (SID®) polypeptide which is part of a marker compound of the
invention specifically binds to a polypeptide which is suspected to be naturally produced by the targeted cell host, the
second detection method of the invention defined above allows a qualitative as well as a quantitative detection of this
targeted polypeptide which is suspected to be naturally produced by the transfected target cell host under assay.
[0251] For example, in the embodiment wherein the procedure for selecting the Selected Interacting Domain
(SID®) polypeptide which is part of a marker compound of the invention includes a first step wherein a collection of
clones containing nucleic acid inserts derived from a H77 strain HCV genomic DNA library is prepared, the transfection
of a mammalian cell, preferably a human cell, with a vector encoding such a marker compound of the invention will
allow the detection of the expression of a human polypeptide which is naturally expressed within said mammalian host cell and which
naturally interacts with the HCV viral protein from which the Selected Interacting Domain (SID®) polypeptide is derived.

[0252] The second detection method of the invention defined above firstly allows the qualitative detection of the
targeted polypeptide of interest which binds specifically to the recombinantly produced marker compound of the
invention, and thus makes it possible to determine under which environmental conditions or at which differentiation stage the targeted
polypeptide of interest is naturally produced within the cell host transfected with a vector expressing a marker compound
of the invention.

[0253] Secondly, this second detection method of the invention allows the localisation of the targeted polypeptide of
interest within the interior of the cell, including localisation in the plasma membrane, cytosol, nucleus and any organelle
such as ribosomes, Golgi apparatus, lysosomes, phagosomes, endoplasmic reticulum and chloroplasts.
[0254] The localisation of a targeted polypeptide of interest which is expressed within the cell host under assay
according to the second detection method of the invention may be carried out by any means well known in the art,
including using a confocal microscope.

[0255] Thirdly, the second detection method of the invention also allows a quantitative analysis of the expression of
the targeted polypeptide of interest within the cell host under assay, since the level of the detection signal produced
by the detectable molecule which is part of the marker compound will be proportional to the number of complexes
formed within the cell host under assay between the targeted polypeptide of interest and the recombinantly produced
marker compound of the invention.
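
The proportionality between detection signal and number of complexes can be exploited through a calibration line. The following Python sketch, given purely as an illustration under stated assumptions, fits a straight line to hypothetical standards of known relative amount and inverts it to estimate an unknown; the function names `fit_line` and `estimate_amount` and all numerical values are invented for the example and do not come from the description.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def estimate_amount(signal, slope, intercept):
    """Invert the calibration line to estimate the relative amount of complexes."""
    return (signal - intercept) / slope


# Hypothetical calibration: known relative amounts vs. measured signal.
amounts = [0.0, 1.0, 2.0, 4.0]
signals = [10.0, 30.0, 50.0, 90.0]
slope, intercept = fit_line(amounts, signals)
unknown = estimate_amount(70.0, slope, intercept)
```

The intercept accounts for background signal measured in the absence of complexes; the estimate is only meaningful within the range covered by the standards.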

[0256] One skilled in the art may refer to the section entitled "Acellular detection method of the
invention" above to find the teachings necessary for performing the detection step c) of the second detection method
described herein.

[0257] In a first embodiment of said second detection method of the invention, the detection step c) consists of the 
measure of the fluorescence signal intrinsically emitted by the detectable molecule comprised in the recombinantly 
expressed marker compound of the invention. 

[0258] In a second preferred embodiment of the second detection method above, the detection step c) consists of
submitting the detectable molecule to a source of energy at the excitation wavelength of said detectable molecule and
measuring the light emitted at the emission wavelength of said detectable molecule.

[0259] In still a further embodiment of the second detection method of the invention, the detection step c) consists 
of measuring the catalytic activity of the detectable molecule. 

[0260] In another embodiment, the detection step c) consists of measuring the radioactivity emitted by the detectable
molecule.

[0261] In yet a further embodiment of the second detection method of the invention, the detection step c) allows the
localisation of the complexes formed between the recombinantly produced marker compound and the targeted polypeptide
of interest within the transfected cell host.

[0262] A further object of the invention consists of a kit for detecting a polypeptide of interest within a prokaryotic or
a eukaryotic cell host, wherein said kit comprises a nucleic acid encoding a marker compound as defined herein, or
a recombinant vector containing inserted therein a nucleic acid encoding a marker compound of the invention.
[0263] Optionally, the detection kit above may further comprise the reagents necessary to carry out the detection
step c).

Cellular detection method of the invention using a marker compound which is introduced within a cell host.

[0264] In a third detection method according to the invention, the marker compound comprising a
Selected Interacting Domain (SID®) polypeptide of SEQ ID N°1 to 38 or a variant thereof is first produced by
any means and subsequently introduced into a target cell host for the purpose of detecting a targeted polypeptide of
interest which binds specifically to said Selected Interacting Domain (SID®) polypeptide.

[0265] Thus, the invention further relates to a method for detecting a polypeptide of interest within a prokaryotic or
a eukaryotic cell host, wherein said method comprises the steps of:

a) providing a cell host to be assayed;

b) introducing a marker compound as defined herein within said cell host;

and

c) detecting the complexes formed between the marker compound and the polypeptide of interest within the cell
host.

[0266] Given the low molecular weight of the Selected Interacting Domain (SID®) polypeptide selected
from SEQ ID N°1 to 38 which is part of a marker compound of the invention, when compared with conventional specific
detection molecules such as antibodies or antibody fragments, the introduction of a marker compound
of the invention into the interior of a target cell host will be much easier to perform than the
introduction within a cell host of a conventional marker such as a labelled antibody or a labelled antibody fragment.
[0267] According to the third detection method of the invention defined above, step b) of introducing the marker
compound within the target cell host may be performed by any technique well known in the art, including electroporation
and the use of molecules that facilitate the passage of the marker compound of the invention through the cell
membranes, typically the plasma membrane.

[0268] Such molecules that facilitate the passage of a marker compound of the invention through cell membranes
include, but are not limited to, penetratins such as Penetratin 1® (Encor, Gaithersburg, Md), the Antennapedia protein,
cationic lipids and cationic polyacrylates.

[0269] Permeation enhancers which may be employed include bile salts such as sodium glycocholate and other
molecules such as β-cyclodextrin. Bile salts are known to increase the absorption of macromolecules across
membranes (Pontiroli et al., 1987).

[0270] As already detailed for the second detection method of the invention described in the previous section, the 
third detection method of the invention allows also the localisation of the targeted polypeptide of interest which is 






EP 1 178 116 A9 (W1A1) 

expressed by the cell host under assay, as well as the qualitative and quantitative analysis of the expression of said 
target polypeptide of interest. 

[0271] The detection step c) according to the third detection method of the invention described above may be carried
out in the same way as the detection step of either the first detection method or the second detection method
detailed in the previous sections herein.

[0272] In a first embodiment of the third detection method above, the detection step c) consists of the measure of 
the fluorescence signal intrinsically emitted by the detectable molecule. 

[0273] In a second embodiment, the detection step c) consists of submitting the detectable molecule to a source of
energy at the excitation wavelength of said detectable molecule and measuring the light emitted at the emission
wavelength of said detectable molecule.

[0274] In a third embodiment, the detection step c) consists of measuring the catalytic activity of the detectable 
molecule. 

[0275] In a fourth embodiment, the detection step c) consists of measuring the radioactivity emitted by the detectable 
molecule. 

[0276] In a fifth embodiment of the third detection method of the invention, the detection step c) allows the localisation
of the complexes formed between the marker compound and the polypeptide of interest within the target cell host under
assay.

[0277] A further object of the invention consists of a kit for detecting a polypeptide of interest within a prokaryotic or
a eukaryotic cell host, wherein said kit comprises a marker compound as defined herein.
[0278] The detection kit above may further comprise the reagents necessary to carry out the detection step c).

[0279] The detection kit above may also further comprise the reagents necessary to facilitate the introduction of the 
marker compound within the target cell host under assay. 

SOLID PHASE DETECTION METHOD USING A SELECTED INTERACTING DOMAIN (SID®) POLYPEPTIDE.

[0280] In a further aspect of the invention, the use of a Selected Interacting Domain (SID®) polypeptide of SEQ ID
N°1 to 38 or a variant thereof for detection purposes includes a step wherein said Selected Interacting Domain (SID®)
polypeptide is immobilised on a suitable substrate before a sample to be assayed is brought into contact with the substrate
onto which said Selected Interacting Domain (SID®) polypeptide has been immobilised.
[0281] A subsequent step consists of detecting the complexes formed between the Selected Interacting Domain
(SID®) polypeptide immobilised on the substrate and the targeted polypeptide of interest the presence of which is
suspected in the sample assayed.

[0282] Thus, the invention also pertains to a fourth detection method which consists of a method for detecting a
polypeptide or a plurality of polypeptides of interest within a sample, wherein said method comprises the steps of:

a) providing a substrate onto which a Selected Interacting Domain (SID®) polypeptide or a plurality of Selected
Interacting Domain (SID®) polypeptides is (are) immobilised;

b) bringing into contact the substrate defined in a) with the sample to be assayed;

c) detecting the complexes formed between the Selected Interacting Domain (SID®) polypeptide or the plurality
of Selected Interacting Domain (SID®) polypeptides and the target polypeptide or the plurality of target polypeptides
contained in the sample.

[0283] Substrates, supports or surfaces for immobilising protein molecules are well known in the art, and many of
them have been described for performing solid phase immunoassays.
[0284] Preferably, a plurality of Selected Interacting Domain (SID®) polypeptides of different amino acid sequences
chosen among the sequences SEQ ID N°1 to 38 are immobilised on the substrate used according to the fourth
detection method of the invention.

[0285] For example, a complete collection of Selected Interacting Domain (SID®) polypeptides which have been
determined according to the methods described in the section entitled "Selected Interacting Domain (SID®)
polypeptides" above, using nucleic acids derived from the H77 strain HCV genomic DNA as starting material, may be
immobilised on a suitable substrate.

[0286] According to this embodiment, the collection of Selected Interacting Domain (SID®) polypeptides of SEQ ID
N°1 to 38 is immobilised on the substrate in an ordered manner, thus forming an ordered area of SID® polypeptides
immobilised at known locations of the surface of said substrate.
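
The idea of SID® polypeptides immobilised at known locations can be sketched as a lookup from each site to the sequence deposited there. The following Python snippet is a minimal illustration under stated assumptions: the grid of 8 columns, the row-by-row filling order, the signal values, the threshold and the function names `build_layout` and `bound_sids` are all hypothetical and are not taken from the description.

```python
def build_layout(seq_ids, n_cols):
    """Map each (row, col) site to one SEQ ID, filling the grid row by row."""
    return {divmod(i, n_cols): seq_id for i, seq_id in enumerate(seq_ids)}


def bound_sids(layout, signals, threshold):
    """Return the SEQ IDs whose sites gave a signal above the threshold."""
    return sorted(layout[site] for site, value in signals.items()
                  if site in layout and value > threshold)


# SEQ ID N°1 to 38 laid out on a hypothetical grid with 8 columns.
layout = build_layout(range(1, 39), n_cols=8)

# Hypothetical read-out: signal measured at three sites after incubation.
hits = bound_sids(layout, {(0, 1): 5.2, (2, 0): 0.3, (4, 5): 9.1}, threshold=1.0)
```

Because each site is tied to a single sequence, a signal at a given position identifies which SID® polypeptide captured a target polypeptide from the sample.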
[0287] The substrate, support or surface may be a porous or a nonporous water insoluble material. The support can
be hydrophilic or capable of being rendered hydrophilic and includes inorganic powders such as silica, magnesium
sulphate, and alumina; natural polymeric materials, particularly cellulosic materials and materials derived from cellulose,
such as fiber containing papers; synthetic or modified naturally occurring polymers, such as nitrocellulose, cellulose
acetate, poly(vinyl chloride), polyacrylamide, cross-linked dextran, agarose, polyacrylate, polyethylene, polypropylene,
poly(4-methylbutene), polystyrene, polymethacrylate, poly(ethylene terephthalate), nylon, poly(vinyl butyrate),
said materials being used by themselves or in conjunction with other materials; glass available as Bioglass,
ceramics, metals and the like.

[0288] An ordered area onto which a plurality of Selected Interacting Domain (SID®) polypeptides are immobilised
may be manufactured according to the techniques disclosed in US Patent N°5,143,854 or PCT Application
N°WO 92/10092, incorporated herein by reference for all purposes. The combination of photolithographic and fabrication
techniques may, for example, enable each Selected Interacting Domain (SID®) polypeptide to occupy a very small
area ("site") on the support. In some embodiments, the site may be as small as a few microns or even a single Selected
Interacting Domain (SID®) polypeptide.

[0289] In a first embodiment of the fourth detection method detailed above, the plurality of Selected Interacting
Domain (SID®) polypeptides are immobilised on the substrate in an ordered manner.

[0290] In a second embodiment of said method, the Selected Interacting Domain (SID®)
polypeptide or the plurality of Selected Interacting Domain (SID®) polypeptides are covalently bound to the substrate.

[0291] In a third embodiment of said method, the Selected Interacting Domain (SID®) polypeptide or the plurality of
Selected Interacting Domain (SID®) polypeptides are non-covalently bound to the substrate. According to this specific
embodiment, the Selected Interacting Domain (SID®) polypeptide or the plurality of Selected Interacting Domain (SID®)
polypeptides are covalently bound to a first ligand molecule and the substrate is coated with a second ligand molecule,
wherein said second ligand molecule specifically binds to the first ligand molecule. According to such a specific
embodiment, the first ligand may be biotin, in which case the second ligand is most preferably streptavidin.

[0292] In still a further embodiment of the fourth detection method according to the invention, the Selected Interacting
Domain (SID®) polypeptide or the plurality of Selected Interacting Domain (SID®) polypeptides are covalently linked
to a spacer, which spacer is itself also covalently bound to the substrate in order to immobilise the Selected Interacting
Domain (SID®) polypeptide or the plurality of Selected Interacting Domain (SID®) polypeptides onto said substrate.
Such a spacer may be a peptide polymer such as a poly-alanine or a poly-lysine peptide of 10 to 15 amino acids in length.
[0293] In still a further embodiment of the fourth detection method above, the detection step c) consists of detecting 
changes in the optical characteristics of the substrate onto which the Selected Interacting Domain (SID®) polypeptide 
or the plurality of Selected Interacting Domain (SID®) polypeptides are bound. 

[0294] In yet a further embodiment of the fourth detection method of the invention, the detection step c) consists of
bringing into contact the substrate wherein complexes are formed between the targeted polypeptide molecule contained
in the sample assayed and the Selected Interacting Domain (SID®) polypeptide or the plurality of Selected Interacting
Domain (SID®) polypeptides bound to said support, with a detectable molecule having the ability to bind to such
complexes.

[0295] A further object of the invention consists of a device or an apparatus for the detection of a polypeptide or a
plurality of polypeptides of interest within a sample, wherein said device or apparatus comprises a substrate onto which
a Selected Interacting Domain (SID®) polypeptide (or a plurality of Selected Interacting Domain (SID®) polypeptides)
is (are) immobilised.

[0296] Such a device or apparatus of the invention above may comprise or consist of a suitable substrate onto which
the plurality of Selected Interacting Domain (SID®) polypeptides are arranged in an ordered manner, thus forming an
area such as described above.

PHARMACEUTICAL COMPOSITIONS CONTAINING A SELECTED INTERACTING DOMAIN (SID®) 
POLYPEPTIDE. 

[0297] It follows from the method by which a Selected Interacting Domain (SID®) polypeptide of SEQ ID
N°1 to 38 has been selected and characterized that such a Selected Interacting Domain (SID®) polypeptide or a variant
thereof is both:

(i) endowed with highly specific binding properties to a (bait) polypeptide of interest;

and

(ii) devoid of the biological activity of the naturally occurring protein from which this Selected Interacting Domain
(SID®) polypeptide or a variant thereof is derived.

[0298] These original properties of a Selected Interacting Domain (SID®) polypeptide of SEQ ID N°1 to 38 or a
variant thereof allow its use for interfering with a naturally occurring interaction between a first protein and a second
protein within the cell of an organism by the binding of said Selected Interacting Domain (SID®) polypeptide specifically
either to said first protein or to said second protein.

[0299] The SID® polypeptides of the invention or variants thereof are capable of interfering with the in vivo
protein-protein interactions between HCV proteins or between a HCV protein and a protein from the organism which has been
infected with the hepatitis C virus.
[0300] For example, the SID® polypeptide of SEQ ID N°2 interferes with the naturally occurring interaction between
the core and the NS3 proteins of HCV. Similarly, the SID® polypeptide of SEQ ID N°17 interferes with the interaction
between the NS4A and the NS4B proteins (see Table 1).

[0301] Thus, another object of the invention consists of a pharmaceutical composition comprising a pharmaceutically
effective amount of a Selected Interacting Domain (SID®) polypeptide or a variant thereof.
[0302] The invention also relates to a pharmaceutical composition comprising a pharmaceutically effective amount
of a nucleic acid comprising a polynucleotide encoding a Selected Interacting Domain (SID®) polypeptide of SEQ ID
N°1 to 38 or a variant thereof, which polynucleotide is placed under the control of an appropriate regulatory sequence.
[0303] Preferred nucleic acids are the nucleotide sequences SEQ ID N°39 to 76.

[0304] The invention also pertains to a pharmaceutical composition comprising a pharmaceutically effective amount
of a recombinant expression vector comprising a polynucleotide encoding the Selected Interacting Domain (SID®)
polypeptide or a variant thereof.

[0305] The invention also pertains to a method for preventing or curing a viral infection by a hepatitis C virus in a
human or an animal, wherein said method comprises a step of administering to the human or animal body a
pharmaceutically effective amount of a Selected Interacting Domain (SID®) polypeptide of SEQ ID N°1 to 38 or a variant
thereof which binds to a targeted viral or mammalian, typically human, protein.

[0306] A pharmaceutical composition as described above, wherein said composition is administered by any route, 
such as intravenous route, intramuscular route, oral route, or mucosal route with an acceptable physiological carrier 
and/or adjuvant, also forms part of the invention. 

[0307] The use of the Selected Interacting Domain (SID®) polypeptide or a variant thereof as a medicament for the prevention
and/or treatment of pathologies induced by HCV is most preferred.

[0308] The Selected Interacting Domain (SID®) polypeptides of SEQ ID N°1 to 38 as active ingredients of a
pharmaceutical composition will preferably be in a soluble form combined with a pharmaceutically acceptable vehicle.
[0309] Such compounds which can be used in a pharmaceutical composition offer a new approach for preventing
and/or treating pathologies linked to infection by HCV. Preferably, these compounds will be administered by the
systemic route, in particular by the intravenous route, by the intramuscular or intradermal route or by the oral route.

[0310] Their modes of administration, optimum dosages and galenic forms can be determined according to the criteria
generally taken into account in establishing a treatment suited to a patient, such as for example the age or body weight
of the patient, the seriousness of the patient's general condition, the tolerance to treatment and the side effects observed, and
the like.

[0311] The identified compound can be administered to a mammal, including a human patient, alone or in
pharmaceutical compositions where it is mixed with suitable carriers or excipients at therapeutically effective doses to
treat disorders associated with prokaryotic micro-organism infection. Techniques for formulation and administration of
the compounds of the invention may be found in "Remington's Pharmaceutical Sciences", Mack Publication Co.,
Easton, PA, latest edition.

[0312] For any Selected Interacting Domain (SID®) polypeptide or any variant thereof used according to the
invention, the therapeutically effective dose can be estimated initially from cell culture assays. For example, a dose can be
formulated in animal models to achieve a circulating concentration range that includes or encompasses a concentration
point or range shown to produce the desired effect in an in vitro system. Such information can be used to more accurately determine
useful doses in humans.

[0313] A therapeutically effective dose refers to that amount of the compound that results in amelioration of symptoms
in a patient. Toxicity and therapeutic efficacy of such compounds can be determined by standard pharmaceutical
procedures in cell cultures or experimental animals, e.g. for determining the LD50 (the dose lethal to 50% of the test
population) and the ED50 (the dose therapeutically effective in 50% of the population). The dose ratio between toxic
and therapeutic effects is the therapeutic index, and it can be expressed as the ratio between LD50 and ED50.
Compounds which exhibit high therapeutic indices are preferred.
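
As a worked illustration of the ratio defined in paragraph [0313], the following minimal Python sketch computes a therapeutic index from an LD50 and an ED50. The function name and the numerical values are hypothetical and are not taken from the description; both doses are assumed to be expressed in the same unit.

```python
def therapeutic_index(ld50: float, ed50: float) -> float:
    """Return the therapeutic index, i.e. the ratio LD50 / ED50.

    Both doses must be expressed in the same unit (for example mg/kg).
    """
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("LD50 and ED50 must be positive doses")
    return ld50 / ed50


# Hypothetical values, for illustration only: LD50 = 200 mg/kg, ED50 = 10 mg/kg.
print(therapeutic_index(200.0, 10.0))
```

A larger value indicates a wider margin between the toxic and the therapeutically effective dose, which is why compounds with high indices are preferred.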

[0314] The data obtained from these cell culture assays and animal studies can be used in formulating a range of
dosages for use in humans. The dosage of such compounds lies preferably within a range of circulating concentrations
that include the ED50, with little or no toxicity. The dosage may vary within this range depending upon the dosage form
employed and the route of administration utilised. The exact formulation, route of administration and dosage can be
chosen by the individual physician in view of the patient's condition (see, e.g., Fingl et al., 1975, in "The Pharmacological
Basis of Therapeutics", Ch. 1).

[0315] Dosage amount and interval may be adjusted individually to provide plasma levels of the active compound
which are sufficient to maintain the modulating effects. Dosages necessary to achieve the modulating effect will depend
on individual characteristics and route of administration.

[0316] The amount of composition administered will, of course, be dependent on the subject being treated, on the
subject's weight, the severity of the affliction, the manner of administration and the judgement of the prescribing
physician.

[0317] The invention also pertains to a method for preventing or curing a viral infection in a human or an animal, wherein said
method comprises the step of administering to the human or animal body a pharmaceutically effective amount of a
nucleic acid comprising a polynucleotide encoding a Selected Interacting Domain (SID®) polypeptide of SEQ ID N°1
to 38, or a variant thereof, and wherein said polynucleotide is placed under the control of a regulatory sequence which
is functional in said human or said animal.

[0318] Preferred polynucleotides are the nucleic acids of SEQ ID N°39 to 76. 

[0319] The invention also relates to a method for preventing or curing a viral or bacterial infection in a human or an animal, wherein
said method comprises the step of administering to the human or animal body a pharmaceutically effective amount of
a recombinant expression vector comprising a polynucleotide encoding a Selected Interacting Domain (SID®)
polypeptide which binds to a viral or bacterial protein.

[0320] Other characteristics and advantages of the invention appear in the remainder of the description with the
examples below, without limiting the invention in any manner.

EXAMPLES: 

EXAMPLE 1: Preparation of an HCV genomic collection.

1.A. Collection preparation and transformation in Escherichia coli 
1.A.1 Fragmentation of genomic DNA preparation. 

[0321] The genomic DNA of the infectious HCV strain H77 (Yanagi et al., P.N.A.S. 1997, 94, 8738-43) is fragmented
in a nebulizer (GATC) for 2 minutes at a pressure of 2 bars, precipitated and resuspended in water.
[0322] The nebulized genomic DNA obtained is successively treated with Mung Bean Nuclease (Biolabs) (30 minutes
at 30°C), T4 DNA polymerase (Biolabs) (10 minutes at 37°C) and Klenow enzyme (Pharmacia) (10 minutes at room
temperature and 1 hour at 16°C).

[0323] DNA is then extracted, precipitated and resuspended in water. 
1.A.2. Ligation of linkers to blunt-ended genomic DNA 

[0324] Oligonucleotide HGX931 (5' end phosphorylated) 1 µg/µl and HGX932 1 µg/µl.
[0325] Sequence of the oligo HGX931: 5'-GGGCCACGAA-3' (SEQ ID N°151).
[0326] Sequence of the oligo HGX932: 5'-TTCGTGGCCCCTG-3' (SEQ ID N°152).

[0327] Linkers were preincubated (5 minutes at 95°C, 10 minutes at 68°C, 15 minutes at 42°C), then cooled
to room temperature and ligated with genomic DNA inserts at 16°C overnight.

[0328] Linkers were then removed on a separation column (Chromaspin TE 400, Clontech), according to the
manufacturer's protocol.

1.A.3. Vector preparation 

[0329] Plasmid pP6 (see figure 13) was prepared by replacing the SpeI/XhoI fragment of pGAD3S2X with the
double-stranded oligonucleotide:

5'CTAGCCATGGCCGCAGGGGCCGCGGCCGCACTAGTGGGGATCCTTAATTAAAG 
GGCCACTGGGGCCCCCCGTACCGGCGTCCCCGGCGCCGGCGTGATCACCCCTA 
GGAATTAATTTCCCGGTGACCCCGGGGGAGCT 3' (SEQ ID N°153).

[0330] The pP6 vector is successively digested with SfiI and BamHI restriction enzymes (Biolabs) for 1 hour at 37°C,
extracted, precipitated and resuspended in water. Digested plasmid vector backbones are purified on a separation
column (Chromaspin TE 400, Clontech), according to the manufacturer's protocol.

1.A.4. Ligation between vector and insert of genomic DNA

[0331] The prepared vector is ligated overnight at 15°C with the genomic blunt-ended DNA described in section 2
using T4 DNA ligase (Biolabs). The DNA is then precipitated and resuspended in water.

1.A.5. Library transformation in Escherichia coli.

[0332] Transform DNA from section 1.A.4. into Electromax DH10B electrocompetent cells (Gibco BRL) with a Cell
Porator apparatus (Gibco BRL). Add 1 ml SOC medium and incubate transformed cells at 37°C for 1 hour. Add 9 ml
volume of SOC medium per tube and plate on LB+ampicillin medium. Scrape colonies with liquid LB medium. Aliquot
and freeze at -80°C.

[0333] The obtained collection of recombinant cell clones is named HGXBHCV1. 

1.B. Collection transformation in Saccharomyces cerevisiae

[0334] The Saccharomyces cerevisiae strain Y187 (MATα Gal4Δ Gal80Δ ade2-101 His3 Leu2-3,-112 Trp1-901
Ura3-52 URA3::UASGAL1-LacZ Met) is transformed with the HGXBHCV1 HCV genomic DNA library.
[0335] The plasmid DNA contained in E. coli is extracted (Qiagen) from aliquoted frozen E. coli cells (1.A.5.).
[0336] Grow Saccharomyces cerevisiae yeast Y187 in YPGlu.
[0337] Yeast transformation is performed according to a standard protocol (GIETZ et al., Yeast, 11, 355-360, 1995)
using yeast carrier DNA (Clontech). This experiment leads to 10^4 to 5×10^4 cells/µg DNA. Spread 2×10^4 cells per plate
on DO-Leu medium. Aliquot and freeze at -80°C. The obtained collection of recombinant cell clones is named
HGXYHCV1.

1.C. Construction of bait plasmids

[0338] Plasmid pB5 (see figure 12) is prepared by replacing the NcoI/SalI polylinker fragment with the
double-stranded oligonucleotide:

5'CATGGCCGCAGGGGCCGCGGCCGCACTAGTGGGGATCCTTAATTAAAGGGCCA 
CTGGGGCCCCCCGGCGTCCCCGGCGCCGGCGTGATCACCCCTAGGAATTAATTT 
CCCGGTGACCCCGGGGGAGCT 3' (SEQ ID N°154).

[0339] The linkered genomic DNA described in section 2 is ligated into pB5 that has been digested with SfiI restriction
enzyme, and the DNA is transformed into competent E. coli. Cells are grown and plasmid DNA is extracted and sequenced.
Those plasmids which code for in-frame fusion proteins are used as bait plasmids.

EXAMPLE 2: Screening the collection with the two-hybrid in yeast system.

2.A. The mating protocol.

[0340] We have chosen the mating two-hybrid in yeast system (first described by FROMONT-RACINE et al., Nature
Genetics, 1997, vol. 16, 277-282, Toward a functional analysis of the yeast genome through exhaustive two-hybrid
screens) for its advantages, but we could also screen the HCV collection in a classical two-hybrid system as described
in Fields et al., or in a yeast reverse two-hybrid system.

[0341] The mating procedure allows a direct selection on selective plates because the two fusion proteins are already
produced in the parental cells. No replica plating is required. This protocol is written for the use of the library transformed
into the Y187 strain.

[0342] Before mating, transform S. cerevisiae (CG1945 strain (MATa Gal4-542 Gal80-538 ade2-101 His3Δ200
Leu2-3,-112 Trp1-901 Ura3-52 Lys2-801 URA3::GAL4 17mers (X3)-CyC1TATA-LacZ LYS2::
GAL1UAS-GAL1TATA-HIS3 CYH^R)) according to step 1.B. and spread on DO-Trp medium.

Day 1, morning: preculture 

[0343] Preculture of Y187 cells carrying the bait plasmid obtained at step 1.C. in 20 ml DO-Trp medium. Grow at
30°C with vigorous agitation.

Day 1, late afternoon: culture 

[0344] Measure the OD600nm of the DO-Trp preculture of Y187 cells carrying the bait plasmid. The OD600nm reading
must lie between 0.1 and 0.5 in order to correspond to a linear measurement. Inoculate 50 ml DO-Trp at OD600nm
0.006/ml, grow overnight at 30°C with vigorous agitation.
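
The inoculation step above amounts to a simple dilution calculation. The sketch below is one way to work it out in Python; the function name, the `dilution_factor` parameter (the factor by which the preculture sample was diluted to bring the reading into the 0.1-0.5 linear range) and the example reading are assumptions introduced for illustration only.

```python
def inoculum_volume_ml(measured_od: float,
                       dilution_factor: float = 1.0,
                       target_od: float = 0.006,
                       culture_volume_ml: float = 50.0) -> float:
    """Volume of preculture (ml) needed to start a culture at target_od.

    measured_od is the OD600nm reading of the (possibly diluted) sample and
    must lie in the linear range 0.1-0.5 stated in the protocol.
    """
    if not 0.1 <= measured_od <= 0.5:
        raise ValueError("OD600nm reading outside the linear range 0.1-0.5")
    preculture_od = measured_od * dilution_factor  # OD of the undiluted preculture
    # C1 * V1 = C2 * V2, solved for V1
    return target_od * culture_volume_ml / preculture_od


# Hypothetical reading: 0.3 measured on a 10-fold dilution of the preculture.
print(round(inoculum_volume_ml(0.3, dilution_factor=10.0), 3))
```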

Day 2 : mating 

Medium and plates
[0345]

1 YPGlu 15 cm plate
50 ml tube with 13 ml DO-Leu-Trp-His
100 ml flask with 5 ml of YPGlu
8 DO-Leu-Trp-His plates
2 DO-Leu plates
2 DO-Trp plates
2 DO-Leu-Trp plates

Measure OD600nm of the DO-Trp culture. It should be around 1.

[0346] For the mating, you must use twice as many bait cells as library cells. To get a good mating efficiency, you
must collect the cells at 10^8 cells per cm^2.

[0347] Estimate the amount of bait culture (in ml) that makes up 30 OD600nm units for the mating with the prey library.
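
The estimate in paragraph [0347] can be sketched as follows, taking one OD600nm unit to be the amount of cells contained in 1 ml of culture at OD600nm = 1. That definition, the function name and the example readings are assumptions made for illustration.

```python
def volume_for_od_units(culture_od: float, od_units: float = 30.0) -> float:
    """Volume of culture (ml) that contains the requested number of OD600nm units.

    One OD unit is assumed here to be 1 ml of culture at OD600nm = 1.
    """
    if culture_od <= 0:
        raise ValueError("culture OD600nm must be positive")
    return od_units / culture_od


# A culture at OD600nm = 1 (the expected value) requires 30 ml;
# a denser culture, e.g. at a hypothetical OD600nm of 1.2, requires less.
print(volume_for_od_units(1.0))
print(round(volume_for_od_units(1.2), 1))
```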
[0348] Thaw a vial containing the HGXYHCV1 library slowly on ice. Add the 0.5 ml of the vial to 5 ml YPGlu. Let
those cells recover at 30°C, under gentle agitation, for 10 minutes.

Mating

[0349] Put the 30 OD600nm units of bait culture into a 50 ml Falcon tube.

[0350] Add the HGXYHCV1 library culture to the bait culture. Centrifuge, discard the supernatant and resuspend in
0.8 ml YPGlu medium.

[0351] Distribute the cells onto a YPGlu plate with glass beads. Spread cells by shaking the plates.
[0352] Incubate the plate cells-up at 30°C for 4 h 30 min.

Collection of mated cells 

[0353] Wash and rinse the plate consecutively with 6 ml and 7 ml of DO-Leu-Trp-His.

[0354] Perform two parallel serial ten-fold dilutions in 500 µl DO-Leu-Trp-His up to 1/10,000. Spread out 50 µl of
each 1/10,000 dilution onto DO-Leu and DO-Trp plates and 50 µl of each 1/1,000 dilution onto DO-Leu-Trp plates.
[0355] Spread 3.2 ml of collected cells in 400 µl aliquots on DO-Leu-Trp-His+Tet plates.

Day 4

[0356] Selection of clones able to grow on DO-Leu-Trp-His+Tetracyclin: this medium allows us to isolate diploid
clones presenting an interaction.

[0357] Count the Trp+Leu+ colonies on control plates and the total number of His+ colonies on the
DO-Leu-Trp-His+Tetracyclin plates.

[0358] The number of His+ cell clones defines which protocol is to be followed. For 2×10^6 Trp+Leu+ colonies:

- if the number of His+ cell clones is <95: process the luminometry protocol on all colonies;
- if the number of His+ cell clones is >95 and <5000: process the luminometry protocol on 95 colonies;
- if the number of His+ cell clones is >5000: repeat the screen using DO-Leu-Trp-His+Tetracyclin plates containing
  3-aminotriazol.
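
The selection rule of paragraph [0358] can be written as a small decision function. This is only a sketch: it assumes that the upper threshold is 5000 (the value that makes the three ranges contiguous), and the handling of counts exactly equal to 95 or 5000, which the text leaves open, is an arbitrary choice. The function name and return strings are hypothetical.

```python
def choose_protocol(his_clones: int) -> str:
    """Pick the follow-up protocol from the number of His+ clones
    counted for 2x10^6 Trp+Leu+ colonies."""
    if his_clones < 0:
        raise ValueError("clone count cannot be negative")
    if his_clones <= 95:   # boundary value 95 assigned here by assumption
        return "luminometry on all colonies"
    if his_clones < 5000:
        return "luminometry on 95 colonies"
    # 5000 and above (boundary assigned here by assumption)
    return "repeat screen with 3-aminotriazol"


for count in (40, 800, 12000):
    print(count, "->", choose_protocol(count))
```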
2.B. The luminometry assay

[0359] Grow His+ colonies overnight at 30°C with shaking in microtiter plates containing DO-Leu-Trp-His+Tetracyclin
medium. The day after, dilute the overnight culture 15-fold into a new microtiter plate containing the same medium.
Incubate 5 hours at 30°C with shaking. Dilute samples 5-fold and read OD600nm. Dilute again to obtain between
10,000 and 75,000 yeast cells/well in 100 µl final volume.

[0360] Per well, add 76 µl of One Step Yeast Lysis Buffer (Tropix), 20 µl Sapphire II Enhancer (Tropix) and 4 µl Galacton
Star (Tropix); incubate 40 minutes at 30°C.

[0361] Measure the β-Gal read-out (L) using a luminometer (Trilux, Wallac).

[0362] Calculate the value of OD600nm × L and select the interacting preys having the highest values.

[0363] At this step of the protocol, we have isolated diploid cell clones presenting interaction. The next step is now 
to identify polypeptides involved in the selected interactions. 

EXAMPLE 3: Identification of positive clones 

3. A. PCR on yeast colonies 
Introduction 

[0364] PCR amplification of fragments of plasmid DNA directly on yeast colonies is a quick and efficient procedure 
to identify sequences cloned into this plasmid. It is directly derived from a published protocol (Wang H. et al., Analytical 
Biochemistry, 237, 145-146, 1996). However, it is not a standardized protocol: in our hands it varies from strain to 
strain, and is dependent on experimental conditions (number of cells, Taq polymerase source, etc). This protocol should 
be optimized to specific local conditions. 

MATERIALS 

[0365] 

For 1 well, PCR mix composition is: 
32.5 µl water,
5 µl 10X PCR buffer (Pharmacia),
1 µl dNTP 10 mM,
0.5 µl Taq polymerase (5 U/µl, Pharmacia),
0.5 µl oligonucleotide ABS1 10 pmole/µl: 5'-GCGTTTGGAATCACTACAGG-3',
0.5 µl oligonucleotide ABS2 10 pmole/µl: 5'-CACGATGCACGTTGAAGTG-3'.

1 N NaOH.

Experiment 

[0366] Grow positive colonies overnight at 30°C with shaking in a 96 well cell culture cluster (Costar) containing 150 µl
DO-Leu-Trp-His+Tetracyclin. Resuspend the culture and immediately transfer 100 µl to a Thermowell 96 (Costar).
[0367] Centrifuge 5 minutes at 4000 rpm at room temperature.
[0368] Remove supernatant. Dispense 5 µl NaOH in each well, shake 1 minute.

[0369] Place the Thermowell in the thermocycler (GeneAmp 9700, Perkin Elmer) for 5 minutes at 99.9°C and then 10
minutes at 4°C.

[0370] In each well, add PCR mix, shake well.

Set up the PCR program as follows:

94°C 3 minutes
94°C 30 seconds, 53°C 1 minute 30 seconds, 72°C 3 minutes (x 35 cycles)
72°C 5 minutes
15°C hold
[0371] Check the quality, the quantity and the length of the PCR fragment on agarose gel.

[0372] The length of the cloned fragment is the estimated length of the PCR fragment minus 300 base pairs that
correspond to the amplified flanking plasmid sequences.
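
The subtraction described in paragraph [0372] is shown below as a minimal sketch; the function name and the example band size are hypothetical.

```python
FLANKING_BP = 300  # amplified flanking plasmid sequence, as stated in [0372]


def cloned_fragment_length(pcr_fragment_bp: int) -> int:
    """Estimate the insert length (bp) from the PCR product length read on the gel."""
    if pcr_fragment_bp < FLANKING_BP:
        raise ValueError("PCR product is shorter than the flanking sequences")
    return pcr_fragment_bp - FLANKING_BP


# Hypothetical band of about 1100 bp on the agarose gel.
print(cloned_fragment_length(1100))
```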

3.B. Plasmid rescue from yeast by electroporation

Introduction 

[0373] The previous protocol of PCR on yeast cells may not be successful; in such a case, we rescue plasmids from
yeast by electroporation. This experiment allows the recovery of prey plasmids from yeast cells by transformation of
E. coli with a yeast cellular extract. We can then amplify the prey plasmid and sequence the cloned fragment.

Material 

Plasmid rescue 

[0374] Glass beads 425-600 µm (Sigma)

[0375] Phenol/chloroform (1/1) premixed with isoamyl alcohol (Amresco)

[0376] Extraction buffer: 2% Triton X100, 1% SDS, 100 mM NaCl, 10 mM TrisHCl pH 8.0, 1 mM EDTA pH 8.0.
[0377] Mix ethanol/NH4Ac: 6 volumes ethanol with 7.5 M NH4 acetate
70% ethanol
Yeast cells in patches on plates.

Electroporation 

[0378] 

SOC medium 
M9 medium 

Selective plates: M9-Leu+Ampicillin 

2 mm electroporation cuvettes (Eurogentec) 

Experiment 

Plasmid rescue 

[0379] Prepare cell patch on DO-Leu-Trp-His with cell culture of section 2.C. 

[0380] Scrape the cells of each patch into an Eppendorf tube, add 300 µl of glass beads to each tube, then add 200 µl
extraction buffer and 200 µl phenol:chloroform:isoamyl alcohol (25:24:1).
Centrifuge tubes 10 minutes at 15000 rpm.

Transfer 180 µl supernatant to a sterile Eppendorf tube, add 500 µl ethanol/NH4Ac to each and vortex.

Centrifuge tubes 15 minutes at 15000 rpm at 4°C.

Wash pellet with 200 µl 70% ethanol, remove ethanol and dry pellet.

Resuspend pellet in 10 µl water. Store extracts at -20°C.

Electroporation 

[0381] Material: Electrocompetent MC1066 cells prepared according to standard protocols (Maniatis).
Add 1 µl of yeast plasmid DNA-extract to a pre-chilled Eppendorf tube, and keep on ice.

Mix 1 µl plasmid yeast DNA-extract sample, add 20 µl electrocompetent cells and transfer into a cold electroporation
cuvette.

Set the Bio-Rad electroporator to 200 ohms resistance, 25 µF capacitance and 2.5 kV. Place the cuvette in the cuvette
holder and electroporate.

Add 1 ml SOC into the cuvette and transfer the cell mix into a sterile Eppendorf tube.

Let cells recover for 30 minutes at 37°C, spin the cells down for 1 minute at 4000 x g and pour off the supernatant. Keep
about 100 µl medium and use it to resuspend the cells and spread them on selective plates (e.g. M9-Leu plates).
Incubate plates for 36 hours at 37°C.

Grow one colony and extract plasmids. Check presence and size of insert through enzymatic digestion and agarose 
gel. Sequence insert. 
EXAMPLE 4: Protein-protein interaction.

[0382] For each bait, the previous protocol leads to the identification of prey polynucleotide sequences. Using a
suitable software program (e.g. Blastwu, available on the Internet site of the University of Washington:
http://bioweb.pasteur.fr/seqanal/interfaces/blastwu.html), the region of the HCV genome encoded by the prey fragment
may be determined, as well as whether or not the fusion proteins encoded are in the same open reading frame of
translation as the HCV polyprotein.

EXAMPLE 5 : Identification of SID® 


[0383] The presence of contiguous polypeptides in the HCV genome and the high complexity of the prey library used
prevent the determination of SID®s by previous means, since prey fragments can overlap multiple polypeptides. The
high complexity of the prey library used relative to the small genome size also prevented such a simple analysis, since
prey fragments can overlap multiple interacting domains. It was also necessary to overcome the problems caused by
protein preys encoded by out-of-frame fusions of regions of the HCV genome.

[0384] In order to determine the SID®s for a particular bait protein, it was therefore necessary to devise a suitable 
algorithm which would take into account all these problems: 

[0385] 5.1. The prey fragments are initially sorted according to which reading frame of the polypeptide sequence
they correspond to. This enables the separation of physiologically relevant prey proteins from out-of-frame fusions which
bind in the two-hybrid assay.

[0386] 5.2. Each prey fragment is compared pairwise with other prey fragments, and two fragments are clustered
together if they overlap by more than 30% of their lengths (see fig. 8). Further fragments are assigned to the cluster
if, and only if, they overlap all the fragments in the cluster by more than 30% of their length.

[0387] 5.3. For each cluster of fragments thus produced, a pre-SID is defined as the intersection of all the fragments
present in the cluster defined in 5.2 (figure 9).

[0388] 5.4. The pre-SIDs defined in 5.3 are then analysed pairwise, and if the region of intersection between two
pre-SIDs is greater than 30 bp then a SID® is defined as this region of intersection. If the non-intersecting region of a
pre-SID is more than 30 bp in length and this non-intersecting region represents more than 30% of the length of one
of the fragments that comprise this region, then this non-intersecting region is also defined as a SID® (figure 10).

[0389] 5.5. The number of fragments contributing to each SID® defined in 5.4 is counted. In the case of overlapping
SID®s, the SID® which contains the most fragments is identified, and all the fragments which contribute to this SID®
are removed from overlapping SID®s. The inspection of the fragments which remain in these overlapping SID®s
determines the final sequence of the SID® (figure 11).
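
Steps 5.2 and 5.3 above can be sketched in Python as follows. This is an illustrative reading of the text, not the implementation used to obtain the SID®s: fragments are modelled as inclusive (start, end) nucleotide intervals that have already been sorted into one reading frame (step 5.1), "overlap by more than 30% of their lengths" is assumed to mean more than 30% of the length of each of the two fragments, and the greedy, order-dependent assignment to clusters is a simplification. All names and coordinates are hypothetical.

```python
from typing import List, Tuple

Fragment = Tuple[int, int]  # inclusive (start, end) nucleotide positions


def length(f: Fragment) -> int:
    return f[1] - f[0] + 1


def overlap_length(a: Fragment, b: Fragment) -> int:
    """Number of positions shared by two fragments (0 if disjoint)."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]) + 1)


def overlaps_enough(a: Fragment, b: Fragment, threshold: float = 0.30) -> bool:
    """True if the overlap exceeds `threshold` of the length of both fragments."""
    ov = overlap_length(a, b)
    return ov > threshold * length(a) and ov > threshold * length(b)


def cluster_fragments(fragments: List[Fragment],
                      threshold: float = 0.30) -> List[List[Fragment]]:
    """Step 5.2: a fragment joins a cluster only if it overlaps every member."""
    clusters: List[List[Fragment]] = []
    for frag in sorted(fragments):
        for cluster in clusters:
            if all(overlaps_enough(frag, member, threshold) for member in cluster):
                cluster.append(frag)
                break
        else:
            clusters.append([frag])  # no suitable cluster: start a new one
    return clusters


def pre_sid(cluster: List[Fragment]) -> Fragment:
    """Step 5.3: the pre-SID is the intersection of all fragments in a cluster."""
    return (max(f[0] for f in cluster), min(f[1] for f in cluster))


# Hypothetical prey fragment coordinates, for illustration only.
preys = [(100, 300), (150, 350), (120, 280), (1000, 1200), (1050, 1250)]
for members in cluster_fragments(preys):
    print(members, "-> pre-SID", pre_sid(members))
```

Because every pair of fragments in a cluster overlaps, the intersection computed by `pre_sid` is never empty. Steps 5.4 and 5.5 would then operate on the resulting pre-SID intervals.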
(1) Nucleic acid sequence encoding the polypeptide from the H77 strain of HCV which binds to the SID polypeptide (4) described in the same line.

(2) 5'-end and 3'-end nucleotide positions of the sequence SEQ ID (1) in reference to the nomenclature disclosed by Yanagi et al. (1997)

(3) Aminoacid sequence of the polypeptide from the H77 strain of HCV which binds to the SID polypeptide (4) described in the same line. 

(4) Nucleic acid sequence encoding the SID polypeptide which binds to the polypeptide of the aminoacid sequence (3) described in the same line. 

(5) Aminoacid sequence of the SID polypeptide which binds to the polypeptide of the aminoacid sequence (3) described in the same line. 
TABLE 1 (continued)

Summary of the protein-protein interactions between the SID polypeptides of the invention and H77 strain HCV polypeptides

Bait        | SEQ ID N° (1) | begin (2) | end (2) | SEQ ID N° (3) | SID                   | SEQ ID N° (4) | begin (2) | end (2) | SEQ ID N° (5)
------------+---------------+-----------+---------+---------------+-----------------------+---------------+-----------+---------+--------------
NS5B (100%) | 146           | 7976      | 8759    | 109           | NS5B (100%)           | 72            | 7775      | 8011    | 35
NS5B (100%) | 147           | 8564      | 8948    | 110           | E2 (100%)             | 73            | 1805      | 1887    | 36
NS5B (100%) | 148           | 8708      | 8978    | 111           | (100%)                | 74            | 1751      | 1865    | 37
NS5B (100%) | 149           | 8996      | 9220    | 112           | NS4B (57%)/NS5A (41%) | 75            | 6194      | 6303    | 38
NS5B (100%) | 150           | 9032      | 9226    | 113           | NS4B (63%)/NS5A (35%) | 76            | 6206      | 6286    | 39


(1) Nucleic acid sequence encoding the polypeptide from the H77 strain of HCV which binds to the SID polypeptide (4) described in the same line.

(2) 5'-end and 3'-end nucleotide positions of the sequence SEQ ID (1) in reference to the nomenclature disclosed by Yanagi et al. (1997)

(3) Aminoacid sequence of the polypeptide from the H77 strain of HCV which binds to the SID polypeptide (4) described in the same line. 

(4) Nucleic acid sequence encoding the SID polypeptide which binds to the polypeptide of the aminoacid sequence (3) described in the same line. 

(5) Aminoacid sequence of the SID polypeptide which binds to the polypeptide of the aminoacid sequence (3) described in the same line. 
SEQUENCE LISTING

<110> HYBRIGENICS S.A. 

<120> SID nucleic acids and polypeptides selected from a 
pathogenic strain of the hepatitis C virus and 
applications 

<130> Hybrigenics - SID HCV 

<140> 
<141> 

<160> 156 

<170> PatentIn Ver. 2.1

<210> 1 

<211> 50 

<212> PRT 

<213> Hepatitis C virus 

<400> 1 

Leu Leu Pro Arg Arg Gly Pro Arg Leu Gly Val Arg Ala Thr Arg Lys
1 5 10 15

Thr Ser Glu Arg Ser Gln Pro Arg Gly Arg Arg Gln Pro Ile Pro Lys
20 25 30

Ala Arg Arg Pro Glu Gly Arg Thr Trp Ala Gln Pro Gly Tyr Pro Trp
35 40 45

Pro Leu
50



<210> 2 
<211> 35 
<212> PRT 

<213> Hepatitis C virus 
<400> 2 

Gly Arg Gly Lys Pro Gly Ile Tyr Arg Phe Val Ala Pro Gly Glu Arg
1 5 10 15 

Pro Ser Gly Met Phe Asp Ser Ser Val Leu Cys Glu Cys Tyr Asp Ala 
20 25 30 

Gly Cys Ala 
35 



<210> 3
<211> 77 
<212> PRT 

<213> Hepatitis C virus 
<400> 3 

Asn Thr Asn Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly Gln
1 5 10 15

Ile Val Gly Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg Leu Gly
20 25 30

Val Arg Ala Thr Arg Lys Thr Ser Glu Arg Ser Gln Pro Arg Gly Arg
35 40 45

Arg Gln Pro Ile Pro Lys Ala Arg Arg Pro Glu Gly Arg Thr Trp Ala
50 55 60

Gln Pro Gly Tyr Pro Trp Pro Leu Tyr Gly Asn Glu Gly
65 70 75


<210> 4 
<211> 37 
<212> PRT 

<213> Hepatitis C virus 
<400> 4 

Pro Ser Pro Val Val Val Gly Thr Thr Asp Arg Ser Gly Ala Pro Thr
1 5 10 15

Tyr Ser Trp Gly Ala Asn Asp Thr Asp Val Phe Val Leu Asn Asn Thr
20 25 30

Arg Pro Pro Leu Gly
35


<210> 5 
<211> 150 
<212> PRT 
<213> Hepatitis C virus

<400> 5

Ser Arg Thr Gln Arg Arg Gly Arg Thr Gly Arg Gly Lys Pro Gly Ile
1 5 10 15

Tyr Arg Phe Val Ala Pro Gly Glu Arg Pro Ser Gly Met Phe Asp Ser
20 25 30

Ser Val Leu Cys Glu Cys Tyr Asp Ala Gly Cys Ala Trp Tyr Glu Leu
35 40 45

Thr Pro Ala Glu Thr Thr Val Arg Leu Arg Ala Tyr Met Asn Thr Pro
50 55 60

Gly Leu Pro Val Cys Gln Asp His Leu Glu Phe Trp Glu Gly Val Phe
65 70 75 80

Thr Gly Leu Thr His Ile Asp Ala His Phe Leu Ser Gln Thr Lys Gln
85 90 95

Ser Gly Glu Asn Phe Pro Tyr Leu Val Ala Tyr Gln Ala Thr Val Cys
100 105 110

Ala Arg Ala Gln Ala Pro Pro Pro Ser Trp Asp Gln Met Trp Lys Cys
115 120 125

Leu Ile Arg Leu Lys Pro Thr Leu His Gly Pro Thr Pro Leu Leu Tyr
130 135 140

Arg Leu Gly Ala Val Gln
145 150


<210> 6 
<211> 28 
<212> PRT 

<213> Hepatitis C virus
<400> 6

Pro Ser Pro Val Val Val Gly Thr Thr Asp Arg Ser Gly Ala Pro Thr
1 5 10 15

Tyr Ser Trp Gly Ala Asn Asp Thr Asp Val Phe Val 
20 25 



<210> 7 
<211> 26
<212> PRT

<213> Hepatitis C virus
<400> 7

Pro Pro Arg Pro Cys Gly Ile Val Pro Ala Lys Ser Val Cys Gly Pro
1 5 10 15 

Val Tyr Cys Phe Thr Pro Ser Pro Val Val 
20 25 



<210> 8 
<211> 54 
<212> PRT

<213> Hepatitis C virus
<400> 8

Cys Val Val Ile Val Gly Arg Ile Val Leu Ser Gly Lys Pro Ala Ile
1 5 10 15

Ile Pro Asp Arg Glu Val Leu Tyr Gln Glu Phe Asp Glu Met Glu Glu
20 25 30

Cys Ser Gln His Leu Pro Tyr Ile Glu Gln Gly Met Met Leu Ala Glu
35 40 45

Gln Phe Lys Gln Lys Ala
50



<210> 9 
<211> 40 
<212> PRT 

<213> Hepatitis C virus
<400> 9

Gly Asp Phe Asp Ser Val Ile Asp Cys Asn Thr Cys Val Thr Gln Thr
1 5 10 15

Val Asp Phe Ser Leu Asp Pro Thr Phe Thr Ile Glu Thr Thr Thr Leu
20 25 30

Pro Gln Asp Ala Val Ser Arg Thr
35 40



<210> 10 
<211> 28 
<212> PRT 

<213> Hepatitis C virus 
<400> 10 

Glu Arg Pro Ser Gly Met Phe Asp Ser Ser Val Leu Cys Glu Cys Tyr 
1 5 10 15

Asp Ala Gly Cys Ala Trp Tyr Glu Leu Thr Pro Ala 
20 25 



<210> 11 
<211> 27 
<212> PRT 

<213> Hepatitis C virus 
<400> 11

Arg Gly Lys Pro Gly Ile Tyr Arg Phe Val Ala Pro Gly Glu Arg Pro
1 5 10 15 

Ser Gly Met Phe Asp Ser Ser Val Leu Cys Glu 
20 25 



<210> 12 
<211> 42
<212> PRT

<213> Hepatitis C virus
<400> 12

Leu Glu Asp Ser Val Thr Pro Ile Asp Thr Thr Ile Met Ala Lys Asn
1 5 10 15

Glu Val Phe Cys Val Gln Pro Glu Lys Gly Gly Arg Lys Pro Ala Arg
20 25 30

Leu Ile Val Phe Pro Asp Leu Gly Val Arg
35 40



<210> 13 
<211> 28 
<212> PRT 

<213> Hepatitis C virus 
<400> 13 

Pro Thr Gly Ser Gly Lys Ser Thr Lys Val Pro Ala Ala Tyr Ala Ala 
1 5 10 15

Gln Gly Tyr Lys Val Leu Val Leu Asn Pro Ser Val
20 25 



<210> 14 
<211> 33 
<212> PRT 

<213> Hepatitis C virus 
<400> 14 

Glu Arg Pro Tyr Cys Trp His Tyr Pro Pro Arg Pro Cys Gly Ile Val
1 5 10 15

Pro Ala Lys Ser Val Cys Gly Pro Val Tyr Cys Phe Thr Pro Ser Pro 
20 25 30 

Val 



<210> 15 
<211> 31 
<212> PRT 

<213> Hepatitis C virus 
<400> 15 

Pro Ser Pro Val Val Val Gly Thr Thr Asp Arg Ser Gly Ala Pro Thr 
1 5 10 15

Tyr Ser Trp Gly Ala Asn Asp Thr Asp Val Phe Val Leu Asn Asn 
20 25 30 



<210> 16 
<211> 77 
<212> PRT

<213> Hepatitis C virus

<400> 16

Ala Gly Ala Leu Val Ala Phe Lys Ile Met Ser Gly Glu Val Pro Ser
1 5 10 15

Thr Glu Asp Leu Val Asn Leu Leu Pro Ala Ile Leu Ser Pro Gly Ala
20 25 30

Leu Val Val Gly Val Val Cys Ala Ala Ile Leu Arg Arg His Val Gly
35 40 45

Pro Gly Glu Gly Ala Val Gln Trp Met Asn Arg Leu Ile Ala Phe Ala
50 55 60

Ser Arg Gly Asn His Val Ser Pro Thr His Tyr Val Pro
65 70 75



<210> 17 
<211> 147 
<212> PRT 

<213> Hepatitis C virus 
<400> 17 

Glu Val Gln Ile Val Ser Thr Ala Thr Gln Thr Phe Leu Ala Thr Cys
1 5 10 15

Ile Asn Gly Val Cys Trp Thr Val Tyr His Gly Ala Gly Thr Arg Thr
20 25 30

Ile Ala Ser Pro Lys Gly Pro Val Ile Gln Met Tyr Thr Asn Val Asp
35 40 45




EP1 178 116 A9 (W1A1) 



Gln Asp Leu Val Gly Trp Pro Ala Pro Gln Gly Ser Arg Ser Leu Thr
50 55 60

Pro Cys Thr Cys Gly Ser Ser Asp Leu Tyr Leu Val Thr Arg His Ala
65 70 75 80

Asp Val Ile Pro Val Arg Arg Arg Gly Asp Ser Arg Gly Ser Leu Leu
85 90 95

Ser Pro Arg Pro Ile Ser Tyr Leu Lys Gly Ser Ser Gly Gly Pro Leu
100 105 110

Leu Cys Pro Ala Gly His Ala Val Gly Leu Phe Arg Ala Ala Val Cys
115 120 125

Thr Arg Gly Val Ala Lys Ala Val Asp Phe Ile Pro Val Glu Asn Leu
130 135 140

Gly Thr Thr
145


<210> 18 

<211> 36 

<212> PRT 

<213> Hepatitis C virus 



<400> 18

Val Thr Gln Leu Leu Arg Arg Leu His Gln Trp Ile Ser Ser Glu Cys
1 5 10 15

Thr Thr Pro Cys Ser Gly Ser Trp Leu Arg Asp Ile Trp Asp Trp Ile
20 25 30

Cys Glu Val Leu
35



<210> 19 
<211> 28

<212> PRT

<213> Hepatitis C virus

<400> 19

Val Cys Gly Pro Val Tyr Cys Phe Thr Pro Ser Pro Val Val Val Gly
1 5 10 15

Thr Thr Asp Arg Ser Gly Ala Pro Thr Tyr Ser Trp 
20 25 



<210> 20 
<211> 45 
<212> PRT

<213> Hepatitis C virus
<400> 20

Pro Pro Leu Arg Ala Trp Arg His Arg Ala Arg Ser Val Arg Ala Arg
1 5 10 15

Leu Leu Ser Arg Gly Gly Arg Ala Ala Ile Cys Gly Lys Tyr Leu Phe
20 25 30

Asn Trp Ala Val Arg Thr Lys Leu Lys Leu Thr Pro Ile
35 40 45 



<210> 21 
<211> 86 
<212> PRT 

<213> Hepatitis C virus 
<400> 21 

Thr Ala Phe Val Gly Ala Gly Leu Ala Gly Ala Ala Ile Gly Ser Val
1 5 10 15

Gly Leu Gly Lys Val Leu Val Asp Ile Leu Ala Gly Tyr Gly Ala Gly
20 25 30

Val Ala Gly Ala Leu Val Ala Phe Lys Ile Met Ser Gly Glu Val Pro
35 40 45

Ser Thr Glu Asp Leu Val Asn Leu Leu Pro Ala Ile Leu Ser Pro Gly
50 55 60

Ala Leu Val Val Gly Val Val Cys Ala Ala Ile Leu Arg Arg His Val
65 70 75 80 

Gly Pro Gly Glu Gly Ala 
85 



<210> 22 
<211> 43 
<212> PRT 

<213> Hepatitis C virus 
<400> 22 

Gly Ile Val Pro Ala Lys Ser Val Cys Gly Pro Val Tyr Cys Phe Thr
1 5 10 15

Pro Ser Pro Val Val Val Gly Thr Thr Asp Arg Ser Gly Ala Pro Thr
20 25 30

Tyr Ser Trp Gly Ala Asn Asp Thr Asp Val Phe
35 40 



<210> 23 
<211> 63 
<212> PRT 

<213> Hepatitis C virus 
<400> 23 

Val Leu Val Asp Ile Leu Ala Gly Tyr Gly Ala Gly Val Ala Gly Ala
1 5 10 15

Leu Val Ala Phe Lys Ile Met Ser Gly Glu Val Pro Ser Thr Glu Asp
20 25 30

Leu Val Asn Leu Leu Pro Ala Ile Leu Ser Pro Gly Ala Leu Val Val
35 40 45

Gly Val Val Cys Ala Ala Ile Leu Arg Arg His Val Gly Pro Gly
50 55 60 



<210> 24 
<211> 29 
<212> PRT 

<213> Hepatitis C virus 
<400> 24 

Glu Arg Pro Tyr Cys Trp His Tyr Pro Pro Arg Pro Cys Gly Ile Val
1 5 10 15

Pro Ala Lys Ser Val Cys Gly Pro Val Tyr Cys Phe Thr 
20 25 



<210> 25 
<211> 76 
<212> PRT 

<213> Hepatitis C virus 
<400> 25 

Arg Arg His Trp Thr Thr Gln Asp Cys Asn Cys Ser Ile Tyr Pro Gly
1 5 10 15

His Ile Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp Ser
20 25 30

Pro Thr Ala Ala Leu Val Val Ala Gln Leu Leu Arg Ile Pro Gln Ala
35 40 45

Ile Met Asp Met Ile Ala Gly Ala His Trp Gly Val Leu Ala Gly Ile
50 55 60 

Ala Tyr Phe Ser Met Val Gly Asn Trp Ala Lys Val 
65 70 75 



<210> 26 
<211> 37 
<212> PRT 

<213> Hepatitis C virus
<400> 26

Ala Ile Leu Ser Ser Leu Thr Val Thr Gln Leu Leu Arg Arg Leu His
1 5 10 15

Gln Trp Ile Ser Ser Glu Cys Thr Thr Pro Cys Ser Gly Ser Trp Leu
20 25 30

Arg Asp Ile Trp Asp
35 



<210> 27 
<211> 47 
<212> PRT 

<213> Hepatitis C virus 
<400> 27 

Val Ser Arg Thr Gln Arg Arg Gly Arg Thr Gly Arg Gly Lys Pro Gly
1 5 10 15

Ile Tyr Arg Phe Val Ala Pro Gly Glu Arg Pro Ser Gly Met Phe Asp
20 25 30

Ser Ser Val Leu Cys Glu Cys Tyr Asp Ala Gly Cys Ala Trp Tyr
35 40 45



<210> 28 
<211> 53 
<212> PRT 

<213> Hepatitis C virus 
<400> 28 

Leu Pro Ala Pro Asn Tyr Lys Phe Ala Leu Trp Arg Val Ser Ala Glu 
1 5 10 15

Glu Tyr Val Glu Ile Arg Arg Val Gly Asp Phe His Tyr Val Ser Gly
20 25 30

Met Thr Thr Asp Asn Leu Lys Cys Pro Cys Gln Ile Pro Ser Pro Glu
35 40 45 

Phe Phe Thr Glu Leu 
50 



<210> 29
<211> 112
<212> PRT

<213> Hepatitis C virus
<400> 29

Gly Glu Ile Pro Phe Tyr Gly Lys Ala Ile Pro Leu Glu Val Ile Lys
1 5 10 15

Gly Gly Arg His Leu Ile Phe Cys His Ser Lys Lys Lys Cys Asp Glu
20 25 30

Leu Ala Ala Lys Leu Val Ala Leu Gly Ile Asn Ala Val Ala Tyr Tyr
35 40 45

Arg Gly Leu Asp Val Ser Val Ile Pro Thr Ser Gly Asp Val Val Val
50 55 60

Val Ser Thr Asp Ala Leu Met Thr Gly Phe Thr Gly Asp Phe Asp Ser
65 70 75 80

Val Ile Asp Cys Asn Thr Cys Val Thr Gln Thr Val Asp Phe Ser Leu
85 90 95

Asp Pro Thr Phe Thr Ile Glu Thr Thr Thr Leu Pro Gln Asp Ala Val
100 105 110 



<210> 30
<211> 54
<212> PRT

<213> Hepatitis C virus
<400> 30

Val Cys Ala Ala Ile Leu Arg Arg His Val Gly Pro Gly Glu Gly Ala
1 5 10 15

Val Gln Trp Met Asn Arg Leu Ile Ala Phe Ala Ser Arg Gly Asn His
20 25 30

Val Ser Pro Thr His Tyr Val Pro Glu Ser Asp Ala Ala Ala Arg Val
35 40 45

Thr Ala Ile Leu Ser Ser
50 



<210> 31 
<211> 102 
<212> PRT 

<213> Hepatitis C virus 
<400> 31 

Ala Ile Lys Ser Leu Thr Glu Arg Leu Tyr Val Gly Gly Pro Leu Thr
1 5 10 15

Asn Ser Arg Gly Glu Asn Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly
20 25 30

Val Leu Thr Thr Ser Cys Gly Asn Thr Leu Thr Cys Tyr Ile Lys Ala
35 40 45

Arg Ala Ala Cys Arg Ala Ala Gly Leu Gln Asp Cys Thr Met Leu Val
50 55 60

Cys Gly Asp Asp Leu Val Val Ile Cys Glu Ser Ala Gly Val Gln Glu
65 70 75 80 

Asp Ala Ala Ser Leu Arg Ala Phe Thr Glu Ala Met Thr Arg Tyr Ser 
85 90 95 

Ala Pro Pro Gly Asp Pro 
100 



<210> 32 
<211> 79
<212> PRT

<213> Hepatitis C virus
<400> 32

Leu Gln Val Leu Asp Ser His Tyr Gln Asp Val Leu Lys Glu Val Lys
1 5 10 15

Ala Ala Ala Ser Lys Val Lys Ala Asn Leu Leu Ser Val Glu Glu Ala
20 25 30

Cys Ser Leu Thr Pro Pro His Ser Ala Lys Ser Lys Phe Gly Tyr Gly
35 40 45

Ala Lys Asp Val Arg Cys His Ala Arg Lys Ala Val Ala His Ile Asn
50 55 60

Ser Val Trp Lys Asp Leu Leu Glu Asp Ser Val Thr Pro Ile Asp
65 70 75 



<210> 33 
<211> 61 
<212> PRT 

<213> Hepatitis C virus 
<400> 33 

Thr Gln Thr Val Asp Phe Ser Leu Asp Pro Thr Phe Thr Ile Glu Thr
1 5 10 15

Thr Thr Leu Pro Gln Asp Ala Val Ser Arg Thr Gln Arg Arg Gly Arg
20 25 30

Thr Gly Arg Gly Lys Pro Gly Ile Tyr Arg Phe Val Ala Pro Gly Glu
35 40 45

Arg Pro Ser Gly Met Phe Asp Ser Ser Val Leu Cys Glu
50 55 60



<210> 34 
<211> 77

<212> PRT

<213> Hepatitis C virus

<400> 34

Val Leu Asp Ser His Tyr Gln Asp Val Leu Lys Glu Val Lys Ala Ala
1 5 10 15

Ala Ser Lys Val Lys Ala Asn Leu Leu Ser Val Glu Glu Ala Cys Ser
20 25 30

Leu Thr Pro Pro His Ser Ala Lys Ser Lys Phe Gly Tyr Gly Ala Lys
35 40 45

Asp Val Arg Cys His Ala Arg Lys Ala Val Ala His Ile Asn Ser Val
50 55 60

Trp Lys Asp Leu Leu Glu Asp Ser Val Thr Pro Ile Asp
65 70 75 



<210> 35 
<211> 26 
<212> PRT 

<213> Hepatitis C virus 
<400> 35

Tyr Pro Pro Arg Pro Cys Gly Ile Val Pro Ala Lys Ser Val Cys Gly
1 5 10 15

Pro Val Tyr Cys Phe Thr Pro Ser Pro Val
20 25



<210> 36 
<211> 37 
<212> PRT 

<213> Hepatitis C virus 



<400> 36 

Pro Ile Ser Tyr Ala Asn Gly Ser Gly Leu Asp Glu Arg Pro Tyr Cys
1 5 10 15

Trp His Tyr Pro Pro Arg Pro Cys Gly Ile Val Pro Ala Lys Ser Val
20 25 30

Cys Gly Pro Val Tyr
35



<210> 37 
<211> 35

<212> PRT

<213> Hepatitis C virus
<400> 37

Thr Val Thr Gln Leu Leu Arg Arg Leu His Gln Trp Ile Ser Ser Glu
1 5 10 15

Cys Thr Thr Pro Cys Ser Gly Ser Trp Leu Arg Asp Ile Trp Asp Trp
20 25 30

Ile Cys Glu
35 



<210> 38

<211> 25
<212> PRT

<213> Hepatitis C virus

<400> 38

Leu Leu Arg Arg Leu His Gln Trp Ile Ser Ser Glu Cys Thr Thr Pro
1 5 10 15

Cys Ser Gly Ser Trp Leu Arg Asp Ile
20 25



<210> 39
<211> 152 
<212> DNA 

<213> Hepatitis C virus 
<400> 39 

cttgttgccg cgcaggggcc ctagattggg tgtgcgcgcg acgaggaaga cttccgagcg 60 
gtcgcaacct cgaggtagac gtcagcctat ccccaaggca cgtcggcccg agggcaggac 120 
ctgggctcag cccgggtacc cttggcccct ct 152 



<210> 40 
<211> 106 
<212> DNA 

<213> Hepatitis C virus 
<400> 40 

tggcaggggg aagccaggca tctatagatt tgtggcaccg ggggagcgcc cctccggcat 60 
gttcgactcg tccgtcctct gtgagtgcta tgacgcgggc tgtgct 106 



<210> 41 
<211> 234
<212> DNA

<213> Hepatitis C virus
<400> 41

taacaccaac cgtcgcccac aggacgtcaa gttcccgggt ggcggtcaga tcgttggtgg 60
agtttacttg ttgccgcgca ggggccctag attgggtgtg cgcgcgacga ggaagacttc 120
cgagcggtcg caacctcgag gtagacgtca gcctatcccc aaggcacgtc ggcccgaggg 180
caggacctgg gctcagcccg ggtacccttg gcccctctat ggcaatgagg gttg 234



<210> 42 
<211> 114 
<212> DNA 

<213> Hepatitis C virus 
<400> 42 

tcccagcccc gtggtggtgg gaacgaccga caggtcgggc gcgcctacct acagctgggg 60 
tgcaaatgat acggatgtct tcgtccttaa caacaccagg ccaccgctgg gcaa 114 



<210> 43 
<211> 453 
<212> DNA 

<213> Hepatitis C virus
<400> 43 

ctccaggact caacgccggg gcaggactgg cagggggaag ccaggcatct atagatttgt 60 
ggcaccgggg gagcgcccct ccggcatgtt cgactcgtcc gtcctctgtg agtgctatga 120 
cgcgggctgt gcttggtatg agctcacgcc cgccgagact acagttaggc tacgagcgta 180
catgaacacc ccggggcttc ccgtgtgcca ggaccatctt gaattttggg agggcgtctt 240
tacgggcctc actcatatag atgcccactt tttatcccag acaaagcaga gtggggagaa 300
ctttccttac ctggtagcgt accaagccac cgtgtgcgct agggctcaag cccctccccc 360
atcgtgggac cagatgtgga agtgtttgat ccgccttaaa cccaccctcc atgggccaac 420
acccctgcta tacagactgg gcgctgttca gaa 453



<210> 44 

<211> 85

<212> DNA 

<213> Hepatitis C virus 



<400> 44 

tcccagcccc gtggtggtgg gaacgaccga caggtcgggc gcgcctacct acagctgggg 60 
tgcaaatgat acggatgtct tcgtc 85 



<210> 45 
<211> 80 
<212> DNA 

<213> Hepatitis C virus 
<400> 45 

ccctccaaga ccttgtggca ttgtgcccgc aaagagcgtg tgtggcccgg tatattgctt 60 
cactcccagc cccgtggtgg 80



<210> 46 
<211> 165 
<212> DNA 

<213> Hepatitis C virus 



<400> 46 

ctgcgtggtc atagtgggca ggatcgtctt gtccgggaag ccggcaatta tacctgacag 60
ggaggttctc taccaggagt tcgatgagat ggaagagtgc tctcagcact taccgtacat 120
cgagcaaggg atgatgctcg ctgagcagtt caagcagaag gccct 165



<210> 47 
<211> 123 
<212> DNA 

<213> Hepatitis C virus 



<400> 47 

cggcgacttc gactctgtga tagactgcaa cacgtgtgtc actcagacag tcgatttcag 60
ccttgaccct acctttacca ttgagacaac cacgctcccc caggatgctg tctccaggac 120
tca 123



<210> 48 
<211> 87 
<212> DNA

<213> Hepatitis C virus
<400> 48

ggagcgcccc tccggcatgt tcgactcgtc cgtcctctgt gagtgctatg acgcgggctg 60 
tgcttggtat gagctcacgc ccgccga 87 

<210> 49 
<211> 84 
<212> DNA 

<213> Hepatitis C virus 
<400> 49 

cagggggaag ccaggcatct atagatttgt ggcaccgggg gagcgcccct ccggcatgtt 60 
cgactcgtcc gtcctctgtg agtg 84 



<210> 50 
<211> 128 
<212> DNA 

<213> Hepatitis C virus 



<400> 50 

tctggaagac agtgtaacac caatagacac taccatcatg gccaagaacg aggttttctg 60
cgttcagcct gagaaggggg gtcgtaagcc agctcgtctc atcgtgttcc ccgacctggg 120
cgtgcgcg 128



<210> 51 
<211> 85
<212> DNA

<213> Hepatitis C virus

<400> 51

tcccaccggc agcggtaaga gcaccaaggt cccggctgcg tacgcagccc agggctacaa 60
ggtgttggtg ctcaacccct ctgtt 85



<210> 52 
<211> 102 
<212> DNA 

<213> Hepatitis C virus
<400> 52 

cgaacgcccc tactgctggc actaccctcc aagaccttgt ggcattgtgc ccgcaaagag 60 
cgtgtgtggc ccggtatatt gcttcactcc cagccccgtg gt 102 



<210> 53 
<211> 95 
<212> DNA 

<213> Hepatitis C virus 



<400> 53 

tcccagcccc gtggtggtgg gaacgaccga caggtcgggc gcgcctacct acagctgggg 60 
tgcaaatgat acggatgtct tcgtccttaa caaca 95



<210> 54 
<211> 234 
<212> DNA 

<213> Hepatitis C virus 
<400> 54 

ggcgggagct cttgtagcat tcaagatcat gagcggtgag gtcccctcca cggaggacct 60 
ggtcaatctg ctgcccgcca tcctctcgcc tggagccctt gtagtcggtg tggtctgcgc 120 
agcaatactg cgccggcacg ttggcccggg cgagggggca gtgcaatgga tgaaccggct 180 
aatagccttc gcctcccggg ggaaccatgt ttcccccacg cactacgtgc cgga 234 



<210> 55 
<211> 442 
<212> DNA 

<213> Hepatitis C virus 
<400> 55 

tgaggtccag atcgtgtcaa ctgctaccca aaccttcctg gcaacgtgca tcaatggggt 60 
atgctggact gtctaccacg gggccggaac gaggaccatc gcatcaccca agggtcctgt 120 
catccagatg tataccaatg tggaccaaga ccttgtgggc tggcccgctc ctcaaggttc 180 
ccgctcattg acaccctgta cctgcggctc ctcggacctt tacctggtca cgaggcacgc 240 
cgatgtcatt cccgtgcgcc ggcgaggtga tagcaggggt agcctgcttt cgccccggcc 300 
catttcctac ttgaaaggct cctcgggggg tccgctgttg tgccccgcgg gacacgccgt 360
gggcctattc agggccgcgg tgtgcacccg tggagtggct aaagcggtgg actttatccc 420 
tgtggagaac ctagggacaa cc 442 



<210> 56 
<211> 111 
<212> DNA 

<213> Hepatitis C virus 
<400> 56 

tgtaacccag ctcctgaggc gactgcatca gtggataagc tcggagtgta ccactccatg 60 
ctccggttcc tggctaaggg acatctggga ctggatatgc gaggtgctga g 111 



<210> 57 
<211> 87 
<212> DNA 

<213> Hepatitis C virus 
<400> 57 

cgtgtgtggc ccggtatatt gcttcactcc cagccccgtg gtggtgggaa cgaccgacag 60 
gtcgggcgcg cctacctaca gctgggg 87 



<210> 58 
<211> 137 
<212> DNA 
<213> Hepatitis C virus
<400> 58

cccgcccttg cgagcttgga gacaccgggc ccggagcgtc cgcgctaggc ttctgtccag 60
aggaggcagg gctgccatat gtggcaagta cctcttcaac tgggcagtaa gaacaaagct 120
caaactcact ccaatag 137



<210> 59 
<211> 259 
<212> DNA 

<213> Hepatitis C virus 
<400> 59 

tactgccttt gtgggtgctg gcctagctgg cgccgccatc ggcagcgttg gactggggaa 60 
ggtcctcgtg gacattcttg cagggtatgg cgcgggcgtg gcgggagctc ttgtagcatt 120 
caagatcatg agcggtgagg tcccctccac ggaggacctg gtcaatctgc tgcccgccat 180 
cctctcgcct ggagcccttg tagtcggtgt ggtctgcgca gcaatactgc gccggcacgt 240 
tggcccgggc gagggggca 259 



<210> 60 
<211> 130 
<212> DNA 

<213> Hepatitis C virus
<400> 60

tggcattgtg cccgcaaaga gcgtgtgtgg cccggtatat tgcttcactc ccagccccgt 60
ggtggtggga acgaccgaca ggtcgggcgc gcctacctac agctggggtg caaatgatac 120
ggatgtcttc 130



<210> 61 
<211> 191 
<212> DNA

<213> Hepatitis C virus
<400> 61

ggtcctcgtg gacattcttg cagggtatgg cgcgggcgtg gcgggagctc ttgtagcatt 60
caagatcatg agcggtgagg tcccctccac ggaggacctg gtcaatctgc tgcccgccat 120
cctctcgcct ggagcccttg tagtcggtgt ggtctgcgca gcaatactgc gccggcacgt 180
tggcccgggc g 191



<210> 62 
<211> 89
<212> DNA 

<213> Hepatitis C virus 
<400> 62 

cgaacgcccc tactgctggc actaccctcc aagaccttgt ggcattgtgc ccgcaaagag 60 
cgtgtgtggc ccggtatatt gcttcactc 89 



<210> 63
<211> 230
<212> DNA

<213> Hepatitis C virus
<400> 63

caggcgccac tggacgacgc aagactgcaa ttgttctatc tatcccggcc atataacggg 60
tcatcgcatg gcatgggata tgatgatgaa ctggtcccct acggcagcgt tggtggtagc 120
tcagctgctc cggatcccac aagccatcat ggacatgatc gctggtgctc actggggagt 180
cctggcgggc atagcgtatt tctccatggt ggggaactgg gcgaaggtcc 230



<210> 64
<211> 113
<212> DNA

<213> Hepatitis C virus 
<400> 64 

tgccatactc agcagcctca ctgtaaccca gctcctgagg cgactgcatc agtggataag 60 
ctcggagtgt accactccat gctccggttc ctggctaagg gacatctggg act 113 



<210> 65 
<211> 142 
<212> DNA 

<213> Hepatitis C virus 
<400> 65 

tgtctccagg actcaacgcc ggggcaggac tggcaggggg aagccaggca tctatagatt 60 
tgtggcaccg ggggagcgcc cctccggcat gttcgactcg tccgtcctct gtgagtgcta 120 
tgacgcgggc tgtgcttggt at 142 



<210> 66 
<211> 162 
<212> DNA 

<213> Hepatitis C virus 
<400> 66 

ccttcctgcg ccgaactata agttcgcgct gtggagggtg tctgcagagg aatacgtgga 60 
gataaggcgg gtgggggact tccactacgt atcgggtatg actactgaca atcttaaatg 120 
cccgtgccag atcccatcgc ccgaattttt cacagaattg ga 162 



<210> 67 
<211> 337 
<212> DNA 

<213> Hepatitis C virus 
<400> 67 

cggagagatc cccttttacg gcaaggctat ccccctcgag gtgatcaagg ggggaagaca 60
tctcatcttc tgccactcaa agaagaagtg cgacgagctc gccgcgaagc tggtcgcatt 120
gggcatcaat gccgtggcct actaccgcgg tcttgacgtg tctgtcatcc cgaccagcgg 180
cgatgttgtc gtcgtgtcga ccgatgctct catgactggc tttaccggcg acttcgactc 240
tgtgatagac tgcaacacgt gtgtcactca gacagtcgat ttcagccttg accctacctt 300
taccattgag acaaccacgc tcccccagga tgctgtc 337 



<210> 68 
<211> 163 
<212> DNA 

<213> Hepatitis C virus 



<400> 68 

ggtctgcgca gcaatactgc gccggcacgt tggcccgggc gagggggcag tgcaatggat 60 
gaaccggcta atagccttcg cctcccgggg gaaccatgtt tcccccacgc actacgtgcc 120
ggagagcgat gcagccgccc gcgtcactgc catactcagc agc 163



<210> 69 
<211> 309 
<212> DNA 

<213> Hepatitis C virus
<400> 69

ggccatcaag tccctcactg agaggcttta tgttgggggc cctcttacca attcaagggg 60
ggaaaactgc ggctaccgca ggtgccgcgc gagcggcgta ctgacaacta gctgtggtaa 120
caccctcact tgctacatca aggcccgggc agcctgtcga gccgcagggc tccaggactg 180
caccatgctc gtgtgtggcg acgacttagt cgttatctgt gaaagtgcgg gggtccagga 240
ggacgcggcg agcctgagag ccttcacgga ggctatgacc aggtactccg ccccccccgg 300
ggacccccc 309 



<210> 70 
<211> 240 
<212> DNA 

<213> Hepatitis C virus 
<400> 70 

actgcaagtt ctggacagcc attaccagga cgtgctcaag gaggtcaaag cagcggcgtc 60
aaaagtgaag gctaacttgc tatccgtaga ggaagcttgc agcctgacgc ccccacattc 120
agccaaatcc aagtttggct atggggcaaa agacgtccgt tgccatgcca gaaaggccgt 180
agcccacatc aactccgtgt ggaaagacct tctggaagac agtgtaacac caatagacac 240 



<210> 71 
<211> 184 
<212> DNA

<213> Hepatitis C virus
<400> 71

cactcagaca gtcgatttca gccttgaccc tacctttacc attgagacaa ccacgctccc 60
ccaggatgct gtctccagga ctcaacgccg gggcaggact ggcaggggga agccaggcat 120
ctatagattt gtggcaccgg gggagcgccc ctccggcatg ttcgactcgt ccgtcctctg 180
tgag 184 



<210> 72 
<211> 234
<212> DNA

<213> Hepatitis C virus
<400> 72

agttctggac agccattacc aggacgtgct caaggaggtc aaagcagcgg cgtcaaaagt 60
gaaggctaac ttgctatccg tagaggaagc ttgcagcctg acgcccccac attcagccaa 120
atccaagttt ggctatgggg caaaagacgt ccgttgccat gccagaaagg ccgtagccca 180
catcaactcc gtgtggaaag accttctgga agacagtgta acaccaatag acac 234



<210> 73 
<211> 80
<212> DNA

<213> Hepatitis C virus
<400> 73

ctaccctcca agaccttgtg gcattgtgcc cgcaaagagc gtgtgtggcc cggtatattg 60
cttcactccc agccccgtgg 80



<210> 74 
<211> 112 
<212> DNA 

<213> Hepatitis C virus 
<400> 74 

tcctatcagt tatgccaacg gaagcggcct cgacgaacgc ccctactgct ggcactaccc 60
tccaagacct tgtggcattg tgcccgcaaa gagcgtgtgt ggcccggtat at 112 



<210> 75 
<211> 107 
<212> DNA 

<213> Hepatitis C virus 
<400> 75 

cactgtaacc cagctcctga ggcgactgca tcagtggata agctcggagt gtaccactcc 60
atgctccggt tcctggctaa gggacatctg ggactggata tgcgagg 107 



<210> 76 
<211> 78 
<212> DNA 

<213> Hepatitis C virus
<400> 76

gctcctgagg cgactgcatc agtggataag ctcggagtgt accactccat gctccggttc 60
ctggctaagg gacatctg 78



<210> 77 
<211> 103 
<212> PRT 
<213> Hepatitis C virus
<400> 77

Ala Cys Glu Cys Pro Gly Arg Ser Arg Arg Pro Cys Thr Met Ser Thr
1 5 10 15

Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn Thr Asn Arg Arg Pro
20 25 30

Gln Asp Val Lys Phe Pro Gly Gly Gly Gln Ile Val Gly Gly Val Tyr
35 40 45

Leu Leu Pro Arg Arg Gly Pro Arg Leu Gly Val Arg Ala Thr Arg Lys
50 55 60

Thr Ser Glu Arg Ser Gln Pro Arg Gly Arg Arg Gln Pro Ile Pro Lys
65 70 75 80

Ala Arg Arg Pro Glu Gly Arg Thr Trp Ala Gln Pro Gly Tyr Pro Trp
85 90 95

Pro Leu Tyr Gly Asn Glu Gly
100



<210> 78 
<211> 113 
<212> PRT 

<213> Hepatitis C virus 
<400> 78 

Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn Thr Asn
1 5 10 15

Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly Gln Ile Val Gly
20 25 30

Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg Leu Gly Val Arg Ala
35 40 45

Thr Arg Lys Thr Ser Glu Arg Ser Gln Pro Arg Gly Arg Arg Gln Pro
50 55 60

Ile Pro Lys Ala Arg Arg Pro Glu Gly Arg Thr Trp Ala Gln Pro Gly
65 70 75 80

Tyr Pro Trp Pro Leu Tyr Gly Asn Glu Gly Cys Gly Trp Ala Gly Trp
85 90 95

Leu Leu Ser Pro Arg Gly Ser Arg Pro Ser Trp Gly Pro Thr Asp Pro
100 105 110

Arg



<210> 79
<211> 114 
<212> PRT 

<213> Hepatitis C virus 
<400> 79 

Ala Ile Leu His Thr Pro Gly Cys Val Pro Cys Val Arg Glu Gly Asn
1 5 10 15

Ala Ser Arg Cys Trp Val Ala Val Thr Pro Thr Val Ala Thr Arg Asp
20 25 30

Gly Lys Leu Pro Thr Thr Gln Leu Arg Arg His Ile Asp Leu Leu Val
35 40 45

Gly Ser Ala Thr Leu Cys Ser Ala Leu Tyr Val Gly Asp Leu Cys Gly
50 55 60

Ser Val Phe Leu Val Gly Gln Leu Phe Thr Phe Ser Pro Arg Arg His
65 70 75 80

Trp Thr Thr Gln Asp Cys Asn Cys Ser Ile Tyr Pro Gly His Ile Thr
85 90 95

Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp Ser Pro Thr Ala
100 105 110

Ala Leu



<210> 80 
<211> 91 
<212> PRT 
<213> Hepatitis C virus

<400> 80

Gly Val Asp Ala Glu Thr His Val Thr Gly Gly Asn Ala Gly Arg Thr
1 5 10 15

Thr Ala Gly Leu Val Gly Leu Leu Thr Pro Gly Ala Lys Gln Asn Ile
20 25 30

Gln Leu Ile Asn Thr Asn Gly Ser Trp His Ile Asn Ser Thr Ala Leu
35 40 45

Asn Cys Asn Glu Ser Leu Asn Thr Gly Trp Leu Ala Gly Leu Phe Tyr
50 55 60

Gln His Lys Phe Asn Ser Ser Gly Cys Pro Glu Arg Leu Ala Ser Cys
65 70 75 80

Arg Arg Leu Thr Asp Phe Ala Gln Gly Trp Gly
85 90 



<210> 81 
<211> 176 
<212> PRT 

<213> Hepatitis C virus 
<400> 81 

Trp Gly Pro Ile Ser Tyr Ala Asn Gly Ser Gly Leu Asp Glu Arg Pro
1 5 10 15

Tyr Cys Trp His Tyr Pro Pro Arg Pro Cys Gly Ile Val Pro Ala Lys
20 25 30

Ser Val Cys Gly Pro Val Tyr Cys Phe Thr Pro Ser Pro Val Val Val
35 40 45

Gly Thr Thr Asp Arg Ser Gly Ala Pro Thr Tyr Ser Trp Gly Ala Asn
50 55 60

Asp Thr Asp Val Phe Val Leu Asn Asn Thr Arg Pro Pro Leu Gly Asn
65 70 75 80

Trp Phe Gly Cys Thr Trp Met Asn Ser Thr Gly Phe Thr Lys Val Cys
85 90 95

Gly Ala Pro Pro Cys Val Ile Gly Gly Val Gly Asn Asn Thr Leu Leu
100 105 110

Cys Pro Thr Asp Cys Phe Arg Lys His Pro Glu Ala Thr Tyr Ser Arg
115 120 125

Cys Gly Ser Gly Pro Trp Ile Thr Pro Arg Cys Met Val Asp Tyr Pro
130 135 140

Tyr Arg Leu Trp His Tyr Pro Cys Thr Ile Asn Tyr Thr Ile Phe Lys
145 150 155 160 

Val Arg Met Tyr Val Gly Gly Val Glu His Arg Leu Glu Ala Ala Cys 
165 170 175 



<210> 82 
<211> 96
<212> PRT

<213> Hepatitis C virus
<400> 82

Trp His Tyr Pro Pro Arg Pro Cys Gly Ile Val Pro Ala Lys Ser Val
1 5 10 15

Cys Gly Pro Val Tyr Cys Phe Thr Pro Ser Pro Val Val Val Gly Thr
20 25 30

Thr Asp Arg Ser Gly Ala Pro Thr Tyr Ser Trp Gly Ala Asn Asp Thr
35 40 45

Asp Val Phe Val Leu Asn Asn Thr Arg Pro Pro Leu Gly Asn Trp Phe
50 55 60

Gly Cys Thr Trp Met Asn Ser Thr Gly Phe Thr Lys Val Cys Gly Ala
65 70 75 80

Pro Pro Cys Val Ile Gly Gly Val Gly Asn Asn Thr Leu Leu Cys Pro
85 90 95 



<210> 83 
<211> 278 
<212> PRT 

<213> Hepatitis C virus 
<400> 83 

Ala Ala Cys Gly Asp Ile Ile Asn Gly Leu Pro Val Ser Ala Arg Arg
1 5 10 15

Gly Gln Glu Ile Leu Leu Gly Pro Ala Asp Gly Met Val Ser Lys Gly
20 25 30

Trp Arg Leu Leu Ala Pro Ile Thr Ala Tyr Ala Gln Gln Thr Arg Gly
35 40 45

Leu Leu Gly Cys Ile Ile Thr Ser Leu Thr Gly Arg Asp Lys Asn Gln
50 55 60

Val Glu Gly Glu Val Gln Ile Val Ser Thr Ala Thr Gln Thr Phe Leu
65 70 75 80

Ala Thr Cys Ile Asn Gly Val Cys Trp Thr Val Tyr His Gly Ala Gly
85 90 95

Thr Arg Thr Ile Ala Ser Pro Lys Gly Pro Val Ile Gln Met Tyr Thr
100 105 110

Asn Val Asp Gln Asp Leu Val Gly Trp Pro Ala Pro Gln Gly Ser Arg
115 120 125

Ser Leu Thr Pro Cys Thr Cys Gly Ser Ser Asp Leu Tyr Leu Val Thr
130 135 140

Arg His Ala Asp Val Ile Pro Val Arg Arg Arg Gly Asp Ser Arg Gly
145 150 155 160

Ser Leu Leu Ser Pro Arg Pro Ile Ser Tyr Leu Lys Gly Ser Ser Gly
165 170 175

Gly Pro Leu Leu Cys Pro Ala Gly His Ala Val Gly Leu Phe Arg Ala
180 185 190

Ala Val Cys Thr Arg Gly Val Ala Lys Ala Val Asp Phe Ile Pro Val
195 200 205

Glu Asn Leu Gly Thr Thr Met Arg Ser Pro Val Phe Thr Asp Asn Ser
210 215 220

Ser Pro Pro Ala Val Pro Gln Ser Phe Gln Val Ala His Leu His Ala
225 230 235 240

Pro Thr Gly Ser Gly Lys Ser Thr Lys Val Pro Ala Ala Tyr Ala Ala
245 250 255

Gln Gly Tyr Lys Val Leu Val Leu Asn Pro Ser Val Ala Ala Thr Leu
260 265 270 

Gly Phe Gly Ala Tyr Met 
275 



<210> 84
<211> 158
<212> PRT

<213> Hepatitis C virus
<400> 84

Arg Arg Arg Gly Asp Ser Arg Gly Ser Leu Leu Ser Pro Arg Pro Ile
1 5 10 15

Ser Tyr Leu Lys Gly Ser Ser Gly Gly Pro Leu Leu Cys Pro Ala Gly
20 25 30

His Ala Val Gly Leu Phe Arg Ala Ala Val Cys Thr Arg Gly Val Ala
35 40 45

Lys Ala Val Asp Phe Ile Pro Val Glu Asn Leu Gly Thr Thr Met Arg
50 55 60

Ser Pro Val Phe Thr Asp Asn Ser Ser Pro Pro Ala Val Pro Gln Ser
65 70 75 80

Phe Gln Val Ala His Leu His Ala Pro Thr Gly Ser Gly Lys Ser Thr
85 90 95

Lys Val Pro Ala Ala Tyr Ala Ala Gln Gly Tyr Lys Val Leu Val Leu
100 105 110

Asn Pro Ser Val Ala Ala Thr Leu Gly Phe Gly Ala Tyr Met Ser Lys
115 120 125

Ala His Gly Val Asp Pro Asn Ile Arg Thr Gly Val Arg Thr Ile Thr
130 135 140

Thr Gly Ser Pro Ile Thr Tyr Ser Thr Tyr Gly Lys Phe Leu
145 150 155 



<210> 85 

<211> 263 

<212> PRT 

<213> Hepatitis C virus

<400> 85

Asp Ser Arg Gly Ser Leu Leu Ser Pro Arg Pro Ile Ser Tyr Leu Lys
1 5 10 15

Gly Ser Ser Gly Gly Pro Leu Leu Cys Pro Ala Gly His Ala Val Gly
20 25 30

Leu Phe Arg Ala Ala Val Cys Thr Arg Gly Val Ala Lys Ala Val Asp
35 40 45

Phe Ile Pro Val Glu Asn Leu Gly Thr Thr Met Arg Ser Pro Val Phe
50 55 60

Thr Asp Asn Ser Ser Pro Pro Ala Val Pro Gln Ser Phe Gln Val Ala
65 70 75 80

His Leu His Ala Pro Thr Gly Ser Gly Lys Ser Thr Lys Val Pro Ala
85 90 95

Ala Tyr Ala Ala Gln Gly Tyr Lys Val Leu Val Leu Asn Pro Ser Val
100 105 110

Ala Ala Thr Leu Gly Phe Gly Ala Tyr Met Ser Lys Ala His Gly Val
115 120 125

Asp Pro Asn Ile Arg Thr Gly Val Arg Thr Ile Thr Thr Gly Ser Pro
130 135 140

Ile Thr Tyr Ser Thr Tyr Gly Lys Phe Leu Ala Asp Gly Gly Cys Ser
145 150 155 160

Gly Gly Ala Tyr Asp Ile Ile Ile Cys Asp Glu Cys His Ser Thr Asp
165 170 175

Ala Thr Ser Ile Leu Gly Ile Gly Thr Val Leu Asp Gln Ala Glu Thr
180 185 190

Ala Gly Ala Arg Leu Val Val Leu Ala Thr Ala Thr Pro Pro Gly Ser
195 200 205

Val Thr Val Ser His Pro Asn Ile Glu Glu Val Ala Leu Ser Thr Thr
210 215 220

Gly Glu Ile Pro Phe Tyr Gly Lys Ala Ile Pro Leu Glu Val Ile Lys
225 230 235 240

Gly Gly Arg His Leu Ile Phe Cys His Ser Lys Lys Lys Cys Asp Glu
245 250 255 

Leu Ala Ala Lys Leu Val Ala 
260 



<210> 86
<211> 194
<212> PRT

<213> Hepatitis C virus
<400> 86

Asp Asn Ser Ser Pro Pro Ala Val Pro Gln Ser Phe Gln Val Ala His
1 5 10 15

Leu His Ala Pro Thr Gly Ser Gly Lys Ser Thr Lys Val Pro Ala Ala
20 25 30

Tyr Ala Ala Gln Gly Tyr Lys Val Leu Val Leu Asn Pro Ser Val Ala
35 40 45

Ala Thr Leu Gly Phe Gly Ala Tyr Met Ser Lys Ala His Gly Val Asp
50 55 60

Pro Asn Ile Arg Thr Gly Val Arg Thr Ile Thr Thr Gly Ser Pro Ile
65 70 75 80

Thr Tyr Ser Thr Tyr Gly Lys Phe Leu Ala Asp Gly Gly Cys Ser Gly
85 90 95

Gly Ala Tyr Asp Ile Ile Ile Cys Asp Glu Cys His Ser Thr Asp Ala
100 105 110

Thr Ser Ile Leu Gly Ile Gly Thr Val Leu Asp Gln Ala Glu Thr Ala
115 120 125

Gly Ala Arg Leu Val Val Leu Ala Thr Ala Thr Pro Pro Gly Ser Val
130 135 140

Thr Val Ser His Pro Asn Ile Glu Glu Val Ala Leu Ser Thr Thr Gly
145 150 155 160

Glu Ile Pro Phe Tyr Gly Lys Ala Ile Pro Leu Glu Val Ile Lys Gly
165 170 175

Gly Arg His Leu Ile Phe Cys His Ser Lys Lys Lys Cys Asp Glu Leu
180 185 190 

Ala Ala 



<210> 87 
<211> 205 
<212> PRT 

<213> Hepatitis C virus 
<400> 87

Leu Ala Asp Gly Gly Cys Ser Gly Gly Ala Tyr Asp Ile Ile Ile Cys
1 5 10 15

Asp Glu Cys His Ser Thr Asp Ala Thr Ser Ile Leu Gly Ile Gly Thr
20 25 30

Val Leu Asp Gln Ala Glu Thr Ala Gly Ala Arg Leu Val Val Leu Ala
35 40 45

Thr Ala Thr Pro Pro Gly Ser Val Thr Val Ser His Pro Asn Ile Glu
50 55 60

Glu Val Ala Leu Ser Thr Thr Gly Glu Ile Pro Phe Tyr Gly Lys Ala
65 70 75 80

Ile Pro Leu Glu Val Ile Lys Gly Gly Arg His Leu Ile Phe Cys His
85 90 95

Ser Lys Lys Lys Cys Asp Glu Leu Ala Ala Lys Leu Val Ala Leu Gly
100 105 110

Ile Asn Ala Val Ala Tyr Tyr Arg Gly Leu Asp Val Ser Val Ile Pro
115 120 125

Thr Ser Gly Asp Val Val Val Val Ser Thr Asp Ala Leu Met Thr Gly
130 135 140

Phe Thr Gly Asp Phe Asp Ser Val Ile Asp Cys Asn Thr Cys Val Thr
145 150 155 160

Gln Thr Val Asp Phe Ser Leu Asp Pro Thr Phe Thr Ile Glu Thr Thr
165 170 175

Thr Leu Pro Gln Asp Ala Val Ser Arg Thr Gln Arg Arg Gly Arg Thr
180 185 190

Gly Arg Gly Lys Pro Gly Ile Tyr Arg Phe Val Ala Pro
195 200 205



EP1 178 116 A9 (W1A1)



<210> 88 
<211> 186 
<212> PRT

<213> Hepatitis C virus

<400> 88

Ser Thr Asp Ala Thr Ser Ile Leu Gly Ile Gly Thr Val Leu Asp Gln
1 5 10 15

Ala Glu Thr Ala Gly Ala Arg Leu Val Val Leu Ala Thr Ala Thr Pro
20 25 30

Pro Gly Ser Val Thr Val Ser His Pro Asn Ile Glu Glu Val Ala Leu
35 40 45

Ser Thr Thr Gly Glu Ile Pro Phe Tyr Gly Lys Ala Ile Pro Leu Glu
50 55 60

Val Ile Lys Gly Gly Arg His Leu Ile Phe Cys His Ser Lys Lys Lys
65 70 75 80

Cys Asp Glu Leu Ala Ala Lys Leu Val Ala Leu Gly Ile Asn Ala Val
85 90 95

Ala Tyr Tyr Arg Gly Leu Asp Val Ser Val Ile Pro Thr Ser Gly Asp
100 105 110

Val Val Val Val Ser Thr Asp Ala Leu Met Thr Gly Phe Thr Gly Asp
115 120 125

Phe Asp Ser Val Ile Asp Cys Asn Thr Cys Val Thr Gln Thr Val Asp
130 135 140

Phe Ser Leu Asp Pro Thr Phe Thr Ile Glu Thr Thr Thr Leu Pro Gln
145 150 155 160

Asp Ala Val Ser Arg Thr Gln Arg Arg Gly Arg Thr Gly Arg Gly Lys
165 170 175

Pro Gly Ile Tyr Arg Phe Val Ala Pro Gly
180 185



<210> 89
<211> 158 
<212> PRT 

<213> Hepatitis C virus 
<400> 89 

Val Ile Asp Cys Asn Thr Cys Val Thr Gln Thr Val Asp Phe Ser Leu
1 5 10 15

Asp Pro Thr Phe Thr Ile Glu Thr Thr Thr Leu Pro Gln Asp Ala Val
20 25 30

Ser Arg Thr Gln Arg Arg Gly Arg Thr Gly Arg Gly Lys Pro Gly Ile
35 40 45

Tyr Arg Phe Val Ala Pro Gly Glu Arg Pro Ser Gly Met Phe Asp Ser
50 55 60

Ser Val Leu Cys Glu Cys Tyr Asp Ala Gly Cys Ala Trp Tyr Glu Leu
65 70 75 80

Thr Pro Ala Glu Thr Thr Val Arg Leu Arg Ala Tyr Met Asn Thr Pro
85 90 95

Gly Leu Pro Val Cys Gln Asp His Leu Glu Phe Trp Glu Gly Val Phe
100 105 110

Thr Gly Leu Thr His Ile Asp Ala His Phe Leu Ser Gln Thr Lys Gln
115 120 125

Ser Gly Glu Asn Phe Pro Tyr Leu Val Ala Tyr Gln Ala Thr Val Cys
130 135 140

Ala Arg Ala Gln Ala Pro Pro Pro Ser Trp Asp Gln Met Trp
145 150 155



<210> 90

<211> 129

<212> PRT

<213> Hepatitis C virus
<400> 90

Arg Phe Val Ala Pro Gly Glu Arg Pro Ser Gly Met Phe Asp Ser Ser
1 5 10 15

Val Leu Cys Glu Cys Tyr Asp Ala Gly Cys Ala Trp Tyr Glu Leu Thr
20 25 30

Pro Ala Glu Thr Thr Val Arg Leu Arg Ala Tyr Met Asn Thr Pro Gly
35 40 45

Leu Pro Val Cys Gln Asp His Leu Glu Phe Trp Glu Gly Val Phe Thr
50 55 60

Gly Leu Thr His Ile Asp Ala His Phe Leu Ser Gln Thr Lys Gln Ser
65 70 75 80

Gly Glu Asn Phe Pro Tyr Leu Val Ala Tyr Gln Ala Thr Val Cys Ala
85 90 95

Arg Ala Gln Ala Pro Pro Pro Ser Trp Asp Gln Met Trp Lys Cys Leu
100 105 110

Ile Arg Leu Lys Pro Thr Leu His Gly Pro Thr Pro Leu Leu Tyr Arg
115 120 125

Leu



<210> 91
<211> 51
<212> PRT

<213> Hepatitis C virus
<400> 91

Thr Ser Thr Trp Val Leu Val Gly Gly Val Leu Ala Ala Leu Ala Ala
1 5 10 15

Tyr Cys Leu Ser Thr Gly Cys Val Val Ile Val Gly Arg Ile Val Leu
20 25 30

Ser Gly Lys Pro Ala Ile Ile Pro Asp Arg Glu Val Leu Tyr Gln Glu
35 40 45 

Phe Asp Glu 
50 



<210> 92 
<211> 18 
<212> PRT 

<213> Hepatitis C virus 
<400> 92 

Ala Ala Leu Ala Ala Tyr Cys Leu Ser Thr Gly Cys Val Val Ile Val
1 5 10 15

Gly Arg 



<210> 93
<211> 208
<212> PRT

<213> Hepatitis C virus
<400> 93

Phe Thr Ala Ala Val Thr Ser Pro Leu Thr Thr Gly Gln Thr Leu Leu
1 5 10 15

Phe Asn Ile Leu Gly Gly Trp Val Ala Ala Gln Leu Ala Ala Pro Gly
20 25 30

Ala Ala Thr Ala Phe Val Gly Ala Gly Leu Ala Gly Ala Ala Ile Gly
35 40 45

Ser Val Gly Leu Gly Lys Val Leu Val Asp Ile Leu Ala Gly Tyr Gly
50 55 60

Ala Gly Val Ala Gly Ala Leu Val Ala Phe Lys Ile Met Ser Gly Glu
65 70 75 80

Val Pro Ser Thr Glu Asp Leu Val Asn Leu Leu Pro Ala Ile Leu Ser
85 90 95

Pro Gly Ala Leu Val Val Gly Val Val Cys Ala Ala Ile Leu Arg Arg
100 105 110

His Val Gly Pro Gly Glu Gly Ala Val Gln Trp Met Asn Arg Leu Ile
115 120 125

Ala Phe Ala Ser Arg Gly Asn His Val Ser Pro Thr His Tyr Val Pro
130 135 140

Glu Ser Asp Ala Ala Ala Arg Val Thr Ala Ile Leu Ser Ser Leu Thr
145 150 155 160

Val Thr Gln Leu Leu Arg Arg Leu His Gln Trp Ile Ser Ser Glu Cys
165 170 175

Thr Thr Pro Cys Ser Gly Ser Trp Leu Arg Asp Ile Trp Asp Trp Ile
180 185 190 

Cys Glu Val Leu Ser Asp Phe Lys Thr Trp Leu Lys Ala Lys Leu Met 
195 200 205 



<210> 94 
<211> 207 
<212> PRT 

<213> Hepatitis C virus 

<400> 94

Thr Ala Phe Val Gly Ala Gly Leu Ala Gly Ala Ala Ile Gly Ser Val
1 5 10 15

Gly Leu Gly Lys Val Leu Val Asp Ile Leu Ala Gly Tyr Gly Ala Gly
20 25 30

Val Ala Gly Ala Leu Val Ala Phe Lys Ile Met Ser Gly Glu Val Pro
35 40 45

Ser Thr Glu Asp Leu Val Asn Leu Leu Pro Ala Ile Leu Ser Pro Gly
50 55 60

Ala Leu Val Val Gly Val Val Cys Ala Ala Ile Leu Arg Arg His Val
65 70 75 80

Gly Pro Gly Glu Gly Ala Val Gln Trp Met Asn Arg Leu Ile Ala Phe
85 90 95

Ala Ser Arg Gly Asn His Val Ser Pro Thr His Tyr Val Pro Glu Ser
100 105 110

Asp Ala Ala Ala Arg Val Thr Ala Ile Leu Ser Ser Leu Thr Val Thr
115 120 125

Gln Leu Leu Arg Arg Leu His Gln Trp Ile Ser Ser Glu Cys Thr Thr
130 135 140

Pro Cys Ser Gly Ser Trp Leu Arg Asp Ile Trp Asp Trp Ile Cys Glu
145 150 155 160

Val Leu Ser Asp Phe Lys Thr Trp Leu Lys Ala Lys Leu Met Pro Gln
165 170 175

Leu Pro Gly Ile Pro Phe Val Ser Cys Gln Arg Gly Tyr Arg Gly Val
180 185 190

Trp Arg Gly Asp Gly Ile Met His Thr Arg Cys His Cys Gly Ala
195 200 205 



<210> 95 
<211> 225 
<212> PRT 

<213> Hepatitis C virus
<400> 95

Leu Val Asp Ile Leu Ala Gly Tyr Gly Ala Gly Val Ala Gly Ala Leu
1 5 10 15

Val Ala Phe Lys Ile Met Ser Gly Glu Val Pro Ser Thr Glu Asp Leu
20 25 30

Val Asn Leu Leu Pro Ala Ile Leu Ser Pro Gly Ala Leu Val Val Gly
35 40 45

Val Val Cys Ala Ala Ile Leu Arg Arg His Val Gly Pro Gly Glu Gly
50 55 60

Ala Val Gln Trp Met Asn Arg Leu Ile Ala Phe Ala Ser Arg Gly Asn
65 70 75 80

His Val Ser Pro Thr His Tyr Val Pro Glu Ser Asp Ala Ala Ala Arg
85 90 95

Val Thr Ala Ile Leu Ser Ser Leu Thr Val Thr Gln Leu Leu Arg Arg
100 105 110

Leu His Gln Trp Ile Ser Ser Glu Cys Thr Thr Pro Cys Ser Gly Ser
115 120 125

Trp Leu Arg Asp Ile Trp Asp Trp Ile Cys Glu Val Leu Ser Asp Phe
130 135 140

Lys Thr Trp Leu Lys Ala Lys Leu Met Pro Gln Leu Pro Gly Ile Pro
145 150 155 160

Phe Val Ser Cys Gln Arg Gly Tyr Arg Gly Val Trp Arg Gly Asp Gly
165 170 175

Ile Met His Thr Arg Cys His Cys Gly Ala Glu Ile Thr Gly His Val
180 185 190

Lys Asn Gly Thr Met Arg Ile Val Gly Pro Arg Thr Cys Arg Asn Met
195 200 205

Trp Ser Gly Thr Phe Pro Ile Asn Ala Tyr Thr Thr Gly Pro Cys Thr
210 215 220 

Pro 
225 



<210> 96 
<211> 145 
<212> PRT

<213> Hepatitis C virus

<400> 96

Ala Gly Tyr Gly Ala Gly Val Ala Gly Ala Leu Val Ala Phe Lys Ile
1 5 10 15

Met Ser Gly Glu Val Pro Ser Thr Glu Asp Leu Val Asn Leu Leu Pro
20 25 30

Ala Ile Leu Ser Pro Gly Ala Leu Val Val Gly Val Val Cys Ala Ala
35 40 45

Ile Leu Arg Arg His Val Gly Pro Gly Glu Gly Ala Val Gln Trp Met
50 55 60

Asn Arg Leu Ile Ala Phe Ala Ser Arg Gly Asn His Val Ser Pro Thr
65 70 75 80

His Tyr Val Pro Glu Ser Asp Ala Ala Ala Arg Val Thr Ala Ile Leu
85 90 95

Ser Ser Leu Thr Val Thr Gln Leu Leu Arg Arg Leu His Gln Trp Ile
100 105 110

Ser Ser Glu Cys Thr Thr Pro Cys Ser Gly Ser Trp Leu Arg Asp Ile
115 120 125

Trp Asp Trp Ile Cys Glu Val Leu Ser Asp Phe Lys Thr Trp Leu Lys
130 135 140 

Ala 
145 



<210> 97 
<211> 54 
<212> PRT

<213> Hepatitis C virus
<400> 97

Ala Leu Val Val Gly Val Val Cys Ala Ala Ile Leu Arg Arg His Val
1 5 10 15

Gly Pro Gly Glu Gly Ala Val Gln Trp Met Asn Arg Leu Ile Ala Phe
20 25 30 

Ala Ser Arg Gly Asn His Val Ser Pro Thr His Tyr Val Pro Glu Ser 
35 40 45 

Asp Ala Ala Ala Arg Val 
50 



<210> 98 
<211> 165 
<212> PRT 

<213> Hepatitis C virus 
<400> 98 

Ala Ser Arg Gly Asn His Val Ser Pro Thr His Tyr Val Pro Glu Ser 
1 5 10 15

Asp Ala Ala Ala Arg Val Thr Ala Ile Leu Ser Ser Leu Thr Val Thr
20 25 30

Gln Leu Leu Arg Arg Leu His Gln Trp Ile Ser Ser Glu Cys Thr Thr
35 40 45

Pro Cys Ser Gly Ser Trp Leu Arg Asp Ile Trp Asp Trp Ile Cys Glu
50 55 60

Val Leu Ser Asp Phe Lys Thr Trp Leu Lys Ala Lys Leu Met Pro Gln
65 70 75 80

Leu Pro Gly Ile Pro Phe Val Ser Cys Gln Arg Gly Tyr Arg Gly Val
85 90 95

Trp Arg Gly Asp Gly Ile Met His Thr Arg Cys His Cys Gly Ala Glu
100 105 110

Ile Thr Gly His Val Lys Asn Gly Thr Met Arg Ile Val Gly Pro Arg
115 120 125

Thr Cys Arg Asn Met Trp Ser Gly Thr Phe Pro Ile Asn Ala Tyr Thr
130 135 140

Thr Gly Pro Cys Thr Pro Leu Pro Ala Pro Asn Tyr Lys Phe Ala Leu 
145 150 155 160 

Trp Arg Val Ser Ala 
165 



<210> 99 
<211> 308 
<212> PRT 

<213> Hepatitis C virus 
<400> 99 

Tyr Val Pro Glu Ser Asp Ala Ala Ala Arg Val Thr Ala Ile Leu Ser
1 5 10 15

Ser Leu Thr Val Thr Gln Leu Leu Arg Arg Leu His Gln Trp Ile Ser
20 25 30

Ser Glu Cys Thr Thr Pro Cys Ser Gly Ser Trp Leu Arg Asp Ile Trp
35 40 45

Asp Trp Ile Cys Glu Val Leu Ser Asp Phe Lys Thr Trp Leu Lys Ala
50 55 60

Lys Leu Met Pro Gln Leu Pro Gly Ile Pro Phe Val Ser Cys Gln Arg
65 70 75 80

Gly Tyr Arg Gly Val Trp Arg Gly Asp Gly Ile Met His Thr Arg Cys
85 90 95

His Cys Gly Ala Glu Ile Thr Gly His Val Lys Asn Gly Thr Met Arg
100 105 110

Ile Val Gly Pro Arg Thr Cys Arg Asn Met Trp Ser Gly Thr Phe Pro
115 120 125

Ile Asn Ala Tyr Thr Thr Gly Pro Cys Thr Pro Leu Pro Ala Pro Asn
130 135 140

Tyr Lys Phe Ala Leu Trp Arg Val Ser Ala Glu Glu Tyr Val Glu Ile
145 150 155 160

Arg Arg Val Gly Asp Phe His Tyr Val Ser Gly Met Thr Thr Asp Asn
165 170 175

Leu Lys Cys Pro Cys Gln Ile Pro Ser Pro Glu Phe Phe Thr Glu Leu
180 185 190

Asp Gly Val Arg Leu His Arg Phe Ala Pro Pro Cys Lys Pro Leu Leu
195 200 205

Arg Glu Glu Val Ser Phe Arg Val Gly Leu His Glu Tyr Pro Val Gly
210 215 220

Ser Gln Leu Pro Cys Glu Pro Glu Pro Asp Val Ala Val Leu Thr Ser
225 230 235 240

Met Leu Thr Asp Pro Ser His Ile Thr Ala Glu Ala Ala Gly Arg Arg
245 250 255

Leu Ala Arg Gly Ser Pro Pro Ser Met Ala Ser Ser Ser Ala Ser Gln
260 265 270

Leu Ser Ala Pro Ser Leu Lys Ala Thr Cys Thr Ala Asn His Asp Ser
275 280 285

Pro Asp Ala Glu Leu Ile Glu Ala Asn Leu Leu Trp Arg Gln Glu Met
290 295 300

Gly Gly Asn Ile
305 



<210> 100 
<211> 283 
<212> PRT 

<213> Hepatitis C virus 
<400> 100 

Leu Ser Ser Leu Thr Val Thr Gln Leu Leu Arg Arg Leu His Gln Trp
1 5 10 15

Ile Ser Ser Glu Cys Thr Thr Pro Cys Ser Gly Ser Trp Leu Arg Asp
20 25 30

Ile Trp Asp Trp Ile Cys Glu Val Leu Ser Asp Phe Lys Thr Trp Leu
35 40 45

Lys Ala Lys Leu Met Pro Gln Leu Pro Gly Ile Pro Phe Val Ser Cys
50 55 60

Gln Arg Gly Tyr Arg Gly Val Trp Arg Gly Asp Gly Ile Met His Thr
65 70 75 80

Arg Cys His Cys Gly Ala Glu Ile Thr Gly His Val Lys Asn Gly Thr
85 90 95

Met Arg Ile Val Gly Pro Arg Thr Cys Arg Asn Met Trp Ser Gly Thr
100 105 110

Phe Pro Ile Asn Ala Tyr Thr Thr Gly Pro Cys Thr Pro Leu Pro Ala
115 120 125

Pro Asn Tyr Lys Phe Ala Leu Trp Arg Val Ser Ala Glu Glu Tyr Val
130 135 140

Glu Ile Arg Arg Val Gly Asp Phe His Tyr Val Ser Gly Met Thr Thr
145 150 155 160

Asp Asn Leu Lys Cys Pro Cys Gln Ile Pro Ser Pro Glu Phe Phe Thr
165 170 175

Glu Leu Asp Gly Val Arg Leu His Arg Phe Ala Pro Pro Cys Lys Pro
180 185 190

Leu Leu Arg Glu Glu Val Ser Phe Arg Val Gly Leu His Glu Tyr Pro
195 200 205

Val Gly Ser Gln Leu Pro Cys Glu Pro Glu Pro Asp Val Ala Val Leu
210 215 220

Thr Ser Met Leu Thr Asp Pro Ser His Ile Thr Ala Glu Ala Ala Gly
225 230 235 240

Arg Arg Leu Ala Arg Gly Ser Pro Pro Ser Met Ala Ser Ser Ser Ala
245 250 255

Ser Gln Leu Ser Ala Pro Ser Leu Lys Ala Thr Cys Thr Ala Asn His
260 265 270

Asp Ser Pro Asp Ala Glu Leu Ile Glu Ala Asn
275 280



<210> 101
<211> 249 
<212> PRT 

<213> Hepatitis C virus 
<400> 101 

Ser Leu Thr Val Thr Gln Leu Leu Arg Arg Leu His Gln Trp Ile Ser
1 5 10 15

Ser Glu Cys Thr Thr Pro Cys Ser Gly Ser Trp Leu Arg Asp Ile Trp
20 25 30

Asp Trp Ile Cys Glu Val Leu Ser Asp Phe Lys Thr Trp Leu Lys Ala
35 40 45

Lys Leu Met Pro Gln Leu Pro Gly Ile Pro Phe Val Ser Cys Gln Arg
50 55 60

Gly Tyr Arg Gly Val Trp Arg Gly Asp Gly Ile Met His Thr Arg Cys
65 70 75 80

His Cys Gly Ala Glu Ile Thr Gly His Val Lys Asn Gly Thr Met Arg
85 90 95

Ile Val Gly Pro Arg Thr Cys Arg Asn Met Trp Ser Gly Thr Phe Pro
100 105 110

Ile Asn Ala Tyr Thr Thr Gly Pro Cys Thr Pro Leu Pro Ala Pro Asn
115 120 125

Tyr Lys Phe Ala Leu Trp Arg Val Ser Ala Glu Glu Tyr Val Glu Ile
130 135 140

Arg Arg Val Gly Asp Phe His Tyr Val Ser Gly Met Thr Thr Asp Asn
145 150 155 160

Leu Lys Cys Pro Cys Gln Ile Pro Ser Pro Glu Phe Phe Thr Glu Leu
165 170 175

Asp Gly Val Arg Leu His Arg Phe Ala Pro Pro Cys Lys Pro Leu Leu
180 185 190

Arg Glu Glu Val Ser Phe Arg Val Gly Leu His Glu Tyr Pro Val Gly
195 200 205

Ser Gln Leu Pro Cys Glu Pro Glu Pro Asp Val Ala Val Leu Thr Ser
210 215 220

Met Leu Thr Asp Pro Ser His Ile Thr Ala Glu Ala Ala Gly Arg Arg
225 230 235 240

Leu Ala Arg Gly Ser Pro Pro Ser Met
245



<210> 102
<211> 85 
<212> PRT 

<213> Hepatitis C virus 
<400> 102 

Thr Trp Leu Lys Ala Lys Leu Met Pro Gln Leu Pro Gly Ile Pro Phe
1 5 10 15

Val Ser Cys Gln Arg Gly Tyr Arg Gly Val Trp Arg Gly Asp Gly Ile
20 25 30

Met His Thr Arg Cys His Cys Gly Ala Glu Ile Thr Gly His Val Lys
35 40 45

Asn Gly Thr Met Arg Ile Val Gly Pro Arg Thr Cys Arg Asn Met Trp
50 55 60

Ser Gly Thr Phe Pro Ile Asn Ala Tyr Thr Thr Gly Pro Cys Thr Pro
65 70 75 80 

Leu Pro Ala Pro Asn 
85 



<210> 103 
<211> 94 
<212> PRT 

<213> Hepatitis C virus 
<400> 103 

Glu Ile Thr Gly His Val Lys Asn Gly Thr Met Arg Ile Val Gly Pro
1 5 10 15

Arg Thr Cys Arg Asn Met Trp Ser Gly Thr Phe Pro Ile Asn Ala Tyr
20 25 30

Thr Thr Gly Pro Cys Thr Pro Leu Pro Ala Pro Asn Tyr Lys Phe Ala
35 40 45

Leu Trp Arg Val Ser Ala Glu Glu Tyr Val Glu Ile Arg Arg Val Gly
50 55 60

Asp Phe His Tyr Val Ser Gly Met Thr Thr Asp Asn Leu Lys Cys Pro
65 70 75 80

Cys Gln Ile Pro Ser Pro Glu Phe Phe Thr Glu Leu Asp Gly
85 90



<210> 104
<211> 75 
<212> PRT 

<213> Hepatitis C virus 
<400> 104 

Ile Glu Ala Asn Leu Leu Trp Arg Gln Glu Met Gly Gly Asn Ile Thr
1 5 10 15

Arg Val Glu Ser Glu Asn Lys Val Val Ile Leu Asp Ser Phe Asp Pro
20 25 30

Leu Val Ala Glu Glu Asp Glu Arg Glu Val Ser Val Pro Ala Glu Ile
35 40 45

Leu Arg Lys Ser Arg Arg Phe Ala Arg Ala Leu Pro Val Trp Ala Arg
50 55 60

Pro Asp Tyr Asn Pro Pro Leu Val Glu Thr Trp
65 70 75



<210> 105 
<211> 90 
<212> PRT 

<213> Hepatitis C virus 
<400> 105 

His Gly Cys Pro Leu Pro Pro Pro Arg Ser Pro Pro Val Pro Pro Pro 
1 5 10 15

Arg Lys Lys Arg Thr Val Val Leu Thr Glu Ser Thr Leu Ser Thr Ala 
20 25 30 

Leu Ala Glu Leu Ala Thr Lys Ser Phe Gly Ser Ser Ser Thr Ser Gly 
35 40 45 

Ile Thr Gly Asp Asn Thr Thr Thr Ser Ser Glu Pro Ala Pro Ser Gly
50 55 60 

Cys Pro Pro Asp Ser Asp Val Glu Ser Tyr Ser Ser Met Pro Pro Leu 
65 70 75 80 

Glu Gly Glu Pro Gly Asp Pro Asp Leu Ser 
85 90 



<210> 106 
<211> 137 
<212> PRT 

<213> Hepatitis C virus
<400> 106

Ser Trp Thr Gly Ala Leu Val Thr Pro Cys Ala Ala Glu Glu Gln Lys
1 5 10 15

Leu Pro Ile Asn Ala Leu Ser Asn Ser Leu Leu Arg His His Asn Leu
20 25 30

Val Tyr Ser Thr Thr Ser Arg Ser Ala Cys Gln Arg Gln Lys Lys Val
35 40 45

Thr Phe Asp Arg Leu Gln Val Leu Asp Ser His Tyr Gln Asp Val Leu
50 55 60

Lys Glu Val Lys Ala Ala Ala Ser Lys Val Lys Ala Asn Leu Leu Ser
65 70 75 80

Val Glu Glu Ala Cys Ser Leu Thr Pro Pro His Ser Ala Lys Ser Lys
85 90 95

Phe Gly Tyr Gly Ala Lys Asp Val Arg Cys His Ala Arg Lys Ala Val
100 105 110

Ala His Ile Asn Ser Val Trp Lys Asp Leu Leu Glu Asp Ser Val Thr
115 120 125

Pro Ile Asp Thr Thr Ile Met Ala Lys
130 135



<210> 107 
<211> 300 
<212> PRT 

<213> Hepatitis C virus 
<400> 107 

Ala Asn Leu Leu Ser Val Glu Glu Ala Cys Ser Leu Thr Pro Pro His 
1 5 10 15 

Ser Ala Lys Ser Lys Phe Gly Tyr Gly Ala Lys Asp Val Arg Cys His 
20 25 30 

Ala Arg Lys Ala Val Ala His Ile Asn Ser Val Trp Lys Asp Leu Leu
35 40 45

Glu Asp Ser Val Thr Pro Ile Asp Thr Thr Ile Met Ala Lys Asn Glu
50 55 60

Val Phe Cys Val Gln Pro Glu Lys Gly Gly Arg Lys Pro Ala Arg Leu
65 70 75 80

Ile Val Phe Pro Asp Leu Gly Val Arg Val Cys Glu Lys Met Ala Leu
85 90 95

Tyr Asp Val Val Ser Lys Leu Pro Leu Ala Val Met Gly Ser Ser Tyr
100 105 110

Gly Phe Gln Tyr Ser Pro Gly Gln Arg Val Glu Phe Leu Val Gln Ala
115 120 125

Trp Lys Ser Lys Lys Thr Pro Met Gly Phe Ser Tyr Asp Thr Arg Cys
130 135 140

Phe Asp Ser Thr Val Thr Glu Ser Asp Ile Arg Thr Glu Glu Ala Ile
145 150 155 160

Tyr Gln Cys Cys Asp Leu Asp Pro Gln Ala Arg Val Ala Ile Lys Ser
165 170 175

Leu Thr Glu Arg Leu Tyr Val Gly Gly Pro Leu Thr Asn Ser Arg Gly
180 185 190

Glu Asn Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Leu Thr Thr
195 200 205

Ser Cys Gly Asn Thr Leu Thr Cys Tyr Ile Lys Ala Arg Ala Ala Cys
210 215 220

Arg Ala Ala Gly Leu Gln Asp Cys Thr Met Leu Val Cys Gly Asp Asp
225 230 235 240

Leu Val Val Ile Cys Glu Ser Ala Gly Val Gln Glu Asp Ala Ala Ser
245 250 255

Leu Arg Ala Phe Thr Glu Ala Met Thr Arg Tyr Ser Ala Pro Pro Gly
260 265 270

Asp Pro Pro Gln Pro Glu Tyr Asp Leu Glu Leu Ile Thr Ser Cys Ser
275 280 285

Ser Asn Val Ser Val Ala His Asp Gly Ala Gly Lys
290 295 300



<210> 108
<211> 199 
<212> PRT 

<213> Hepatitis C virus 
<400> 108 

Glu Glu Ala Cys Ser Leu Thr Pro Pro His Ser Ala Lys Ser Lys Phe 
1 5 10 15

Gly Tyr Gly Ala Lys Asp Val Arg Cys His Ala Arg Lys Ala Val Ala
20 25 30

His Ile Asn Ser Val Trp Lys Asp Leu Leu Glu Asp Ser Val Thr Pro
35 40 45

Ile Asp Thr Thr Ile Met Ala Lys Asn Glu Val Phe Cys Val Gln Pro
50 55 60

Glu Lys Gly Gly Arg Lys Pro Ala Arg Leu Ile Val Phe Pro Asp Leu
65 70 75 80

Gly Val Arg Val Cys Glu Lys Met Ala Leu Tyr Asp Val Val Ser Lys
85 90 95

Leu Pro Leu Ala Val Met Gly Ser Ser Tyr Gly Phe Gln Tyr Ser Pro
100 105 110

Gly Gln Arg Val Glu Phe Leu Val Gln Ala Trp Lys Ser Lys Lys Thr
115 120 125

Pro Met Gly Phe Ser Tyr Asp Thr Arg Cys Phe Asp Ser Thr Val Thr
130 135 140

Glu Ser Asp Ile Arg Thr Glu Glu Ala Ile Tyr Gln Cys Cys Asp Leu
145 150 155 160

Asp Pro Gln Ala Arg Val Ala Ile Lys Ser Leu Thr Glu Arg Leu Tyr
165 170 175

Val Gly Gly Pro Leu Thr Asn Ser Arg Gly Glu Asn Cys Gly Tyr Arg
180 185 190

Arg Cys Arg Ala Ser Gly Val
195



<210> 109
<211> 260 
<212> PRT 

<213> Hepatitis C virus 
<400> 109 

Leu Leu Glu Asp Ser Val Thr Pro Ile Asp Thr Thr Ile Met Ala Lys
1 5 10 15

Asn Glu Val Phe Cys Val Gln Pro Glu Lys Gly Gly Arg Lys Pro Ala
20 25 30

Arg Leu Ile Val Phe Pro Asp Leu Gly Val Arg Val Cys Glu Lys Met
35 40 45

Ala Leu Tyr Asp Val Val Ser Lys Leu Pro Leu Ala Val Met Gly Ser
50 55 60

Ser Tyr Gly Phe Gln Tyr Ser Pro Gly Gln Arg Val Glu Phe Leu Val
65 70 75 80

Gln Ala Trp Lys Ser Lys Lys Thr Pro Met Gly Phe Ser Tyr Asp Thr
85 90 95

Arg Cys Phe Asp Ser Thr Val Thr Glu Ser Asp Ile Arg Thr Glu Glu
100 105 110

Ala Ile Tyr Gln Cys Cys Asp Leu Asp Pro Gln Ala Arg Val Ala Ile
115 120 125

Lys Ser Leu Thr Glu Arg Leu Tyr Val Gly Gly Pro Leu Thr Asn Ser
130 135 140

Arg Gly Glu Asn Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Leu
145 150 155 160

Thr Thr Ser Cys Gly Asn Thr Leu Thr Cys Tyr Ile Lys Ala Arg Ala
165 170 175

Ala Cys Arg Ala Ala Gly Leu Gln Asp Cys Thr Met Leu Val Cys Gly
180 185 190

Asp Asp Leu Val Val Ile Cys Glu Ser Ala Gly Val Gln Glu Asp Ala
195 200 205

Ala Ser Leu Arg Ala Phe Thr Glu Ala Met Thr Arg Tyr Ser Ala Pro
210 215 220

Pro Gly Asp Pro Pro Gln Pro Glu Tyr Asp Leu Glu Leu Ile Thr Ser
225 230 235 240 

Cys Ser Ser Asn Val Ser Val Ala His Asp Gly Ala Gly Lys Arg Val 
245 250 255 

Tyr Tyr Leu Thr 
260 



<210> 110 
<211> 127 
<212> PRT 

<213> Hepatitis C virus 
<400> 110 

Val Ile Cys Glu Ser Ala Gly Val Gln Glu Asp Ala Ala Ser Leu Arg
1 5 10 15

Ala Phe Thr Glu Ala Met Thr Arg Tyr Ser Ala Pro Pro Gly Asp Pro
20 25 30

Pro Gln Pro Glu Tyr Asp Leu Glu Leu Ile Thr Ser Cys Ser Ser Asn
35 40 45

Val Ser Val Ala His Asp Gly Ala Gly Lys Arg Val Tyr Tyr Leu Thr
50 55 60

Arg Asp Pro Thr Thr Pro Leu Ala Arg Ala Ala Trp Glu Thr Ala Arg
65 70 75 80

His Thr Pro Val Asn Ser Trp Leu Gly Asn Ile Ile Met Phe Ala Pro
85 90 95

Thr Leu Trp Ala Arg Met Ile Leu Met Thr His Phe Phe Ser Val Leu
100 105 110

Ile Ala Arg Asp Gln Leu Glu Gln Ala Leu Asn Cys Glu Ile Tyr
115 120 125 



<210> 111 
<211> 89 
<212> PRT 

<213> Hepatitis C virus 
<400> 111 

Val Ser Val Ala His Asp Gly Ala Gly Lys Arg Val Tyr Tyr Leu Thr 
1 5 10 15

Arg Asp Pro Thr Thr Pro Leu Ala Arg Ala Ala Trp Glu Thr Ala Arg 
20 25 30 

His Thr Pro Val Asn Ser Trp Leu Gly Asn Ile Ile Met Phe Ala Pro
35 40 45

Thr Leu Trp Ala Arg Met Ile Leu Met Thr His Phe Phe Ser Val Leu
50 55 60

Ile Ala Arg Asp Gln Leu Glu Gln Ala Leu Asn Cys Glu Ile Tyr Gly
65 70 75 80

Ala Cys Tyr Ser Ile Glu Pro Leu Asp

85 



<210> 112 
<211> 73 
<212> PRT 

<213> Hepatitis C virus 
<400> 112 

Leu His Gly Leu Ser Ala Phe Ser Leu His Ser Tyr Ser Pro Gly Glu 
1 5 10 15

Ile Asn Arg Val Ala Ala Cys Leu Arg Lys Leu Gly Val Pro Pro Leu
20 25 30

Arg Ala Trp Arg His Arg Ala Arg Ser Val Arg Ala Arg Leu Leu Ser
35 40 45

Arg Gly Gly Arg Ala Ala Ile Cys Gly Lys Tyr Leu Phe Asn Trp Ala
50 55 60 

Val Arg Thr Lys Leu Lys Leu Thr Pro 
65 70 



<210> 113
<211> 63
<212> PRT

<213> Hepatitis C virus
<400> 113

Ser Pro Gly Glu Ile Asn Arg Val Ala Ala Cys Leu Arg Lys Leu Gly
1 5 10 15

Val Pro Pro Leu Arg Ala Trp Arg His Arg Ala Arg Ser Val Arg Ala
20 25 30

Arg Leu Leu Ser Arg Gly Gly Arg Ala Ala Ile Cys Gly Lys Tyr Leu
35 40 45

Phe Asn Trp Ala Val Arg Thr Lys Leu Lys Leu Thr Pro Ile Ala
50 55 60



<210> 114 
<211> 310 
<212> DNA

<213> Hepatitis C virus
<400> 114

tgcttgcgag tgccccggga ggtctcgtag accgtgcacc atgagcacga atcctaaacc 60 
tcaaagaaaa accaaacgta acaccaaccg tcgcccacag gacgtcaagt tcccgggtgg 120 
cggtcagatc gttggtggag tttacttgtt gccgcgcagg ggccctagat tgggtgtgcg 180 
cgcgacgagg aagacttccg agcggtcgca acctcgaggt agacgtcagc ctatccccaa 240 
ggcacgtcgg cccgagggca ggacctgggc tcagcccggg tacccttggc ccctctatgg 300
caatgagggt 310 



<210> 115 
<211> 339 
<212> DNA 
<213> Hepatitis C virus

<400> 115 

atgagcacga atcctaaacc tcaaagaaaa accaaacgta acaccaaccg tcgcccacag 60 
gacgtcaagt tcccgggtgg cggtcagatc gttggtggag tttacttgtt gccgcgcagg 120
ggccctagat tgggtgtgcg cgcgacgagg aagacttccg agcggtcgca acctcgaggt 180
agacgtcagc ctatccccaa ggcacgtcgg cccgagggca ggacctgggc tcagcccggg 240
tacccttggc ccctctatgg caatgagggt tgcgggtggg cgggatggct cctgtctccc 300
cgtggctctc ggcctagctg gggccccaca gacccccgg 339



<210> 116 
<211> 345 
<212> DNA 

<213> Hepatitis C virus
<400> 116

tgccatcctg cacactccgg ggtgtgtccc ttgcgttcgc gagggtaacg cctcgaggtg 60
ttgggtggcg gtgaccccca cggtggccac cagggacggc aaactcccca caacgcagct 120
tcgacgtcat atcgatctgc ttgtcgggag cgccaccctc tgctcggccc tctacgtggg 180
ggacctgtgc gggtctgtct ttcttgttgg tcaactgttt accttctctc ccaggcgcca 240
ctggacgacg caagactgca attgttctat ctatcccggc catataacgg gtcatcgcat 300
ggcatgggat atgatgatga actggtcccc tacggcagcg ttggt 345



<210> 117 
<211> 276 
<212> DNA 

<213> Hepatitis C virus 
<400> 117 

cggcgtcgac gcggaaaccc acgtcaccgg gggaaatgcc ggccgcacca cggctgggct 60 
tgttggtctc cttacaccag gcgccaagca gaacatccaa ctgatcaaca ccaacggcag 120
ttggcacatc aatagcacgg ccttgaattg caatgaaagc cttaacaccg gctggttagc 180
agggctcttc tatcaacaca aattcaactc ttcaggctgt cctgagaggt tggccagctg 240
ccgacgcctt accgattttg cccagggctg gggtcc 276



<210> 118 
<211> 531 
<212> DNA 

<213> Hepatitis C virus 
<400> 118 

ctggggtcct atcagttatg ccaacggaag cggcctcgac gaacgcccct actgctggca 60 
ctaccctcca agaccttgtg gcattgtgcc cgcaaagagc gtgtgtggcc cggtatattg 120 
cttcactccc agccccgtgg tggtgggaac gaccgacagg tcgggcgcgc ctacctacag 180
ctggggtgca aatgatacgg atgtcttcgt ccttaacaac accaggccac cgctgggcaa 240
ttggttcggt tgtacctgga tgaactcaac tggattcacc aaagtgtgcg gagcgccccc 300
ttgtgtcatc ggaggggtgg gcaacaacac cttgctctgc cccactgatt gcttccgcaa 360
acatccggaa gccacatact ctcggtgcgg ctccggtccc tggattacac ccaggtgcat 420
ggtcgactac ccgtataggc tttggcacta tccttgtacc atcaattaca ccatattcaa 480 
agtcaggatg tacgtgggag gggtcgagca caggctggaa gcggcctgca a 531 



<210> 119 
<211> 289 
<212> DNA 

<213> Hepatitis C virus 
<400> 119

ctggcactac cctccaagac cttgtggcat tgtgcccgca aagagcgtgt gtggcccggt 60
atattgcttc actcccagcc ccgtggtggt gggaacgacc gacaggtcgg gcgcgcctac 120
ctacagctgg ggtgcaaatg atacggatgt cttcgtcctt aacaacacca ggccaccgct 180
gggcaattgg ttcggttgta cctggatgaa ctcaactgga ttcaccaaag tgtgcggagc 240
gcccccttgt gtcatcggag gggtgggcaa caacaccttg ctctgcccc 289



<210> 120 
<211> 836 
<212> DNA 

<213> Hepatitis C virus 
<400> 120 

gccgcgtgcg gtgacatcat caacggcttg cccgtctctg cccgtagggg ccaggagara 60
ctgcttgggc cagccgacgg aatggtctcc aaggggtgga ggttgctggc gcccatcacg 120
gcgtacgccc agcagacgag aggcctccta gggtgtataa tcaccagcct gactggccgg 180
gacaaaaacc aagtggaggg tgaggtccag atcgtgtcaa ctgctaccca aaccttcctg 240
gcaacgtgca tcaatggggt atgctggact gtctaccacg gggccggaac gaggaccatc 300
gcatcaccca agggtcctgt catccagatg tataccaatg tggaccaaga ccttgtgggc 360
tggcccgctc ctcaaggttc ccgctcattg acaccctgta cctgcggctc ctcggacctt 420
tacctggtca cgaggcacgc cgatgtcatt cccgtgcgcc ggcgaggtga tagcaggggt 480
agcctgcttt cgccccggcc catttcctac ttgaaaggct cctcgggggg tccgctgttg 540
tgccccgcgg gacacgccgt gggcctattc agggccgcgg tgtgcacccg tggagtggct 600
aaagcggtgg actttatccc tgtggagaac ctagggacaa ccatgagatc cccggtgttc 660
acggacaact cctctccacc agcagtgccc cagagcttcc aggtggccca cctgcatgct 720
cccaccggca gcggtaagag caccaaggtc ccggctgcgt acgcagccca gggctacaag 780
gtgttggtgc tcaacccctc tgttgctgca acgctgggct ttggtgctta catgtc 836



<210> 121 
<211> 475
<212> DNA

<213> Hepatitis C virus
<400> 121

gcgccggcga ggtgatagca ggggtagcct gctttcgccc cggcccattt cctacttgaa 60
aggctcctcg gggggtccgc tgttgtgccc cgcgggacac gccgtgggcc tattcagggc 120
cgcggtgtgc acccgtggag tggctaaagc ggtggacttt atccctgtgg agaacctagg 180
gacaaccatg agatccccgg tgttcacgga caactcctct ccaccagcag tgccccagag 240
cttccaggtg gcccacctgc atgctcccac cggcagcggt aagagcacca aggtcccggc 300
tgcgtacgca gcccagggct acaaggtgtt ggtgctcaac ccctctgttg ctgcaacgct 360
gggctttggt gcttacatgt ccaaggccca tggggttgat cctaatatca ggaccggggt 420
gagaacaatt accactggca gccccatcac gtactccacc tacggcaagt tcctt 475



<210> 122 
<211> 790
<212> DNA

<213> Hepatitis C virus
<400> 122

tgatagcagg ggtagcctgc tttcgccccg gcccatttcc tacttgaaag gctcctcggg 60
gggtccgctg ttgtgccccg cgggacacgc cgtgggccta ttcagggccg cggtgtgcac 120
ccgtggagtg gctaaagcgg tggactttat ccctgtggag aacctaggga caaccatgag 180
atccccggtg ttcacggaca actcctctcc accagcagtg ccccagagct tccaggtggc 240
ccacctgcat gctcccaccg gcagcggtaa gagcaccaag gtcccggctg cgtacgcagc 300
ccagggctac aaggtgttgg tgctcaaccc ctctgttgct gcaacgctgg gctttggtgc 360
ttacatgtcc aaggcccatg gggttgatcc taatatcagg accggggtga gaacaattac 420
cactggcagc cccatcacgt actccaccta cggcaagttc cttgccgacg gcgggtgctc 480
aggaggtgct tatgacataa taatttgtga cgagtgccac tccacggatg ccacatccat 540
cttgggcatc ggcactgtcc ttgaccaagc agagactgcg ggggcgagac tggttgtgct 600
cgccactgct acccctccgg gctccgtcac tgtgtcccat cctaacatcg aggaggttgc 660
tctgtccacc accggagaga tcccctttta cggcaaggct atccccctcg aggtgatcaa 720
ggggggaaga catctcatct tctgccactc aaagaagaag tgcgacgagc tcgccgcgaa 780
gctggtcgca 790



<210> 123 
<211> 583 
<212> DNA 

<213> Hepatitis C virus 
<400> 123 

ggacaactcc tctccaccag cagtgcccca gagcttccag gtggcccacc tgcatgctcc 60
caccggcagc ggtaagagca ccaaggtccc ggctgcgtac gcagcccagg gctacaaggt 120
gttggtgctc aacccctctg ttgctgcaac gctgggcttt ggtgcttaca tgtccaaggc 180
ccatggggtt gatcctaata tcaggaccgg ggtgagaaca attaccactg gcagccccat 240
cacgtactcc acctacggca agttccttgc cgacggcggg tgctcaggag gtgcttatga 300
cataataatt tgtgacgagt gccactccac ggatgccaca tccatcttgg gcatcggcac 360
tgtccttgac caagcagaga ctgcgggggc gagactggtt gtgctcgcca ctgctacccc 420
tccgggctcc gtcactgtgt cccatcctaa catcgaggag gttgctctgt ccaccaccgg 480
agagatcccc ttttacggca aggctatccc cctcgaggtg atcaaggggg gaagacatct 540
catcttctgc cactcaaaga agaagtgcga cgagctcgcc gcg 583



<210> 124
<211> 617
<212> DNA

<213> Hepatitis C virus
<400> 124

ccttgccgac ggcgggtgct caggaggtgc ttatgacata ataatttgtg acgagtgcca 60
ctccacggat gccacatcca tcttgggcat cggcactgtc cttgaccaag cagagactgc 120
gggggcgaga ctggttgtgc tcgccactgc tacccctccg ggctccgtca ctgtgtccca 180
tcctaacatc gaggaggttg ctctgtccac caccggagag atcccctttt acggcaaggc 240
tatccccctc gaggtgatca aggggggaag acatctcatc ttctgccact caaagaagaa 300
gtgcgacgag ctcgccgcga agctggtcgc attgggcatc aatgccgtgg cctactaccg 360
cggtcttgac gtgtctgtca tcccgaccag cggcgatgtt gtcgtcgtgt cgaccgatgc 420
tctcatgact ggctttaccg gcgacttcga ctctgtgata gactgcaaca cgtgtgtcac 480
tcagacagtc gatttcagcc ttgaccctac ctttaccatt gagacaacca cgctccccca 540
ggatgctgtc tccaggactc aacgccgggg caggactggc agggggaagc caggcatcta 600
tagatttgtg gcaccgg 617



<210> 125 
<211> 559 
<212> DNA

<213> Hepatitis C virus
<400> 125

ctccacggat gccacatcca tcttgggcat cggcactgtc cttgaccaag cagagactgc 60
gggggcgaga ctggttgtgc tcgccactgc tacccctccg ggctccgtca ctgtgtccca 120
tcctaacatc gaggaggttg ctctgtccac caccggagag atcccctttt acggcaaggc 180
tatccccctc gaggtgatca aggggggaag acatctcatc ttctgccact caaagaagaa 240
gtgcgacgag ctcgccgcga agctggtcgc attgggcatc aatgccgtgg cctactaccg 300
cggtcttgac gtgtctgtca tcccgaccag cggcgatgtt gtcgtcgtgt cgaccgatgc 360
tctcatgact ggctttaccg gcgacttcga ctctgtgata gactgcaaca cgtgtgtcac 420
tcagacagtc gatttcagcc ttgaccctac ctttaccatt gagacaacca cgctccccca 480
ggatgctgtc tccaggactc aacgccgggg caggactggc agggggaagc caggcatcta 540
tagatttgtg gcaccgggg 559



<210> 126 
<211> 475 
<212> DNA 

<213> Hepatitis C virus 



<400> 126 

tgtgatagac tgcaacacgt gtgtcactca gacagtcgat ttcagccttg accctacctt 60
taccattgag acaaccacgc tcccccagga tgctgtctcc aggactcaac gccggggcag 120
gactggcagg gggaagccag gcatctatag atttgtggca ccgggggagc gcccctccgg 180
catgttcgac tcgtccgtcg tctgtgagtg ctatgacgcg ggctgtgctt ggtatgagct 240
cacgcccgcc gagactacag ttaggctacg agcgtacatg aacaccccgg ggcttcccgt 300
gtgccaggac catcttgaat tttgggaggg cgtctttacg ggcctcactc atatagatgc 360
ccacttttta tcccagacaa agcagagtgg ggagaacttt ccttacctgg tagcgtacca 420
agccaccgtg tgcgctaggg ctcaagcccc tcccccatcg tgggaccaga tgtgg 475



<210> 127 
<211> 390 
<212> DNA 

<213> Hepatitis C virus 



<400> 127 

tagatttgtg gcaccggggg agcgcccctc cggcatgttc gactcgtccg tcgtctgtga 60
gtgctatgac gcgggctgtg cttggtatga gctcacgccc gccgagacta cagttaggct 120
acgagcgtac atgaacaccc cggggcttcc cgtgtgccag gaccatcttg aattttggga 180
gggcgtcttt acgggcctca ctcatataga tgcccacttt ttatcccaga caaagcagag 240
tggggagaac tttccttacc tggtagcgta ccaagccacc gtgtgcgcta gggctcaagc 300
ccctccccca tcgtgggacc agatgtggaa gtgtttgatc cgccttaaac ccaccctcca 360
tgggccaaca cccctgctat acagactggg 390



<210> 128 
<211> 155 
<212> DNA 

<213> Hepatitis C virus 



<400> 128 

acgagcacct gggtgctcgt tggcggcgtc ctggctgctc tggccgcgta ttgcctgtca 60
acaggctgcg tggtcatagt gggcaggatc gtcttgtccg ggaagccggc aattatacct 120 
gacagggagg ttctctacca ggagttcgat gagat 155 



<210> 129 
<211> 56 
<212> DNA 

<213> Hepatitis C virus 
<400> 129 

ggctgctctg gccgcgtatt gcctgtcaac aggctgcgtg gtcatagtgg gcagga 56 



<210> 130 
<211> 625 
<212> DNA 
<213> Hepatitis C virus

<400> 130 

rtttacagct gccgtcacca gcccactaac cactggccaa accctcctct tcaacatatt 60 

gsrgggggtgg gtggctgccc agctcgccgc ccccggtgcc gctactgcct ttgtgggtgc 120 

tggcctagct ggcgccgcca tcggcagcgt tggactgggg aaggtcctcg tggacattct 180

tgcagggtat ggcgcgggcg tggcgggagc tcttgtagca ttcaagatca tgagcggtga 240

ggtcccctcc acggaggacc tggtcaatct gctgcccgcc atcctctcgc ctggagccct 300

tgtagtcggt gtggtctgcg cagcaatact gcgccggcac gttggcccgg gcgagggggc 360 

agtgcaatgg atgaaccggc taatagcctt cgcctcccgg gggaaccatg tttcccccac 420 

gcactacgtg ccggagagcg atgcagccgc ccgcgtcact gccatactca gcagcctcac 480 

tgtaacccag ctcctgaggc gactgcatca gtggataagc tcggagtgta ccactccatg 540

ctccggttcc tggctaaggg acatctggga ctggatatgc gaggtgctga gcgactttaa 600 

gacctggctg aaagccaagc tcatg 625 

<210> 131

<211> 623 

<212> DNA 

<213> Hepatitis C virus 
<400> 131 

tactgccttt gtgggtgctg gcctagctgg cgccgccatc ggcagcgttg gactggggaa 60 

ggtcctcgtg gacattcttg cagggtatgg cgcgggcgtg gcgggagctc ttgtagcatt 120 

caagatcatg agcggtgagg tcccctccac ggaggacctg gtcaatctgc tgcccgccat 180 

cctctcgcct ggagcccttg tagtcggtgt ggtctgcgca gcaatactgc gccggcacgt 240 

tggcccgggc gagggggcag tgcaatggat gaaccggcta atagccttcg cctcccgggg 300

gaaccatgtt tcccccacgc actacgtgcc ggagagcgat gcagccgccc gcgtcactgc 360

catactcagc agcctcactg taacccagct cctgaggcga ctgcatcagt ggataagctc 420 

ggagtgtacc actccatgct ccggttcctg gctaagggac atctgggact ggatatgcga 480

ggtgctgagc gactttaaga cctggctgaa agccaagctc atgccacaac tgcctgggat 540 

tccctttgtg tcctgccagc gcgggtatag gggggtctgg cgaggagacg gcattatgca 600 
cactcgctgc cactgtggag ctg 623 



<210> 132
<211> 678
<212> DNA

<213> Hepatitis C virus
<400> 132

cctcgtggac attcttgcag ggtatggcgc gggcgtggcg ggagctcttg tagcattcaa 60
gatcatgagc ggtgaggtcc cctccacgga ggacctggtc aatctgctgc ccgccatcct 120
ctcgcctgga gcccttgtag tcggtgtggt ctgcgcagca atactgcgcc ggcacgttgg 180
cccgggcgag ggggcagtgc aatggatgaa ccggctaata gccttcgcct cccgggggaa 240
ccatgtttcc cccacgcact acgtgccgga gagcgatgca gccgcccgcg tcactgccat 300
actcagcagc ctcactgtaa cccagctcct gaggcgactg catcagtgga taagctcgga 360
gtgtaccact ccatgctccg gttcctggct aagggacatc tgggactgga tatgcgaggt 420
gctgagcgac tttaagacct ggctgaaagc caagctcatg ccacaactgc ctgggattcc 480
ctttgtgtcc tgccagcgcg ggtatagggg ggtctggcga ggagacggca ttatgcacac 540
tcgctgccac tgtggagctg agatcactgg acatgtcaaa aacgggacga tgaggatcgt 600
cggtcctagg acctgcagga acatgtggag tgggacgttc cccattaacg cctacaccac 660
gggcccctgt actcccct 678



<210> 133 
<211> 436
<212> DNA

<213> Hepatitis C virus

<400> 133

tgcagggtat ggcgcgggcg tggcgggagc tcttgtagca ttcaagatca tgagcggtga 60
ggtcccctcc acggaggacc tggtcaatct gctgcccgcc atcctctcgc ctggagccct 120
tgtagtcggt gtggtctgcg cagcaatact gcgccggcac gttggcccgg gcgagggggc 180
agtgcaatgg atgaaccggc taatagcctt cgcctcccgg gggaaccatg tttcccccac 240
gcactacgtg ccggagagcg atgcagccgc ccgcgtcact gccatactca gcagcctcac 300
tgtaacccag ctcctgaggc gactgcatca gtggataagc tcggagtgta ccactccatg 360
ctccggttcc tggctaaggg acatctggga ctggatatgc gaggtgctga gcgactttaa 420
gacctggctg aaagcc 436



<210> 134
<211> 164
<212> DNA

<213> Hepatitis C virus

<400> 134

agcccttgta gtcggtgtgg tctgcgcagc aatactgcgc cggcacgttg gcccgggcga 60
gggggcagtg caatggatga accggctaat agccttcgcc tcccggggga accatgtttc 120
ccccacgcac tacgtgccgg agagcgatgc agccgcccgc gtca 164



<210> 135 
<211> 496 
<212> DNA 

<213> Hepatitis C virus
<400> 135

cgcctcccgg gggaaccatg tttcccccac gcactacgtg ccggagagcg atgcagccgc 60
ccgcgtcact gccatactca gcagcctcac tgtaacccag ctcctgaggc gactgcatca 120
gtggataagc tcggagtgta ccactccatg ctccggttcc tggctaaggg acatctggga 180
ctggatatgc gaggtgctga gcgactttaa gacctggctg aaagccaagc tcatgccaca 240
actgcctggg attccctttg tgtcctgcca gcgcgggtat aggggggtct ggcgaggaga 300
cggcattatg cacactcgct gccactgtgg agctgagatc actggacatg tcaaaaacgg 360
gacgatgagg atcgtcggtc ctaggacctg caggaacatg tggagtggga cgttccccat 420
taacgcctac accacgggcc cctgtactcc ccttcctgcg ccgaactata agttcgcgct 480
gtggagggtg tctgca 496



<210> 136 
<211> 926 
<212> DNA 

<213> Hepatitis C virus 
<400> 136 

tacgtgccgg agagcgatgc agccgcccgc gtcactgcca tactcagcag cctcactgta 60
acccagctcc tgaggcgact gcatcagtgg ataagctcgg agtgtaccac tccatgctcc 120
ggttcctggc taagggacat ctgggactgg atatgcgagg tgctgagcga ctttaagacc 180
tggctgaaag ccaagctcat gccacaactg cctgggattc cctttgtgtc ctgccagcgc 240
gggtataggg gggtctggcg aggagacggc attatgcaca ctcgctgcca ctgtggagct 300
gagatcactg gacatgtcaa aaacgggacg atgaggatcg tcggtcctag gacctgcagg 360
aacatgtgga gtgggacgtt ccccattaac gcctacacca cgggcccctg tactcccctt 420
cctgcgccga actataagtt cgcgctgtgg agggtgtctg cagaggaata cgtggagata 480
aggcgggtgg gggacttcca ctacgtatcg ggtatgacta ctgacaatct taaatgcccg 540
tgccagatcc catcgcccga atttttcaca gaattggacg gggtgcgcct acacaggttt 600
gcgccccctt gcaagccctt gctgcgggag gaggtatcat tcagagtagg actccacgag 660
tacccggtgg ggtcgcaatt accttgcgag cccgaaccgg acgtagccgt gttgacgtcc 720
atgctcactg atccctccca tataacagca gaggcggccg ggagaaggtt ggcgagaggg 780
tcaccccctt ctatggccag ctcctcggct agccagctgt ccgctccatc tctcaaggca 840
acttgcaccg ccaaccatga ctcccctgac gccgagctca tagaggctaa cctcctgtgg 900
aggcaggaga tgggcggcaa catcac 926



<210> 137
<211> 850
<212> DNA

<213> Hepatitis C virus
<400> 137

actcagcagc ctcactgtaa cccagctcct gaggcgactg catcagtgga taagctcgga 60
gtgtaccact ccatgctccg gttcctggct aagggacatc tgggactgga tatgcgaggt 120
gctgagcgac tttaagacct ggctgaaagc caagctcatg ccacaactgc ctgggattcc 180
ctttgtgtcc tgccagcgcg ggtatagggg ggtctggcga ggagacggca ttatgcacac 240
tcgctgccac tgtggagctg agatcactgg acatgtcaaa aacgggacga tgaggatcgt 300
cggtcctagg acctgcagga acatgtggag tgggacgttc cccattaacg cctacaccac 360
gggcccctgt actccccttc ctgcgccgaa ctataagttc gcgctgtgga gggtgtctgc 420
agaggaatac gtggagataa ggcgggtggg ggacttccac tacgtatcgg gtatgactac 480
tgacaatctt aaatgcccgt gccagatccc atcgcccgaa tttttcacag aattggacgg 540
ggtgcgccta cacaggtttg cgcccccttg caagcccttg ctgcgggagg aggtatcatt 600
cagagtagga ctccacgagt acccggtggg gtcgcaatta ccttgcgagc ccgaaccgga 660
cgtagccgtg ttgacgtcca tgctcactga tccctcccat ataacagcag aggcggccgg 720
gagaaggttg gcgagagggt cacccccttc tatggccagc tcctcggcta gccagctgtc 780
cgctccatct ctcaaggcaa cttgcaccgc caaccatgac tcccctgacg ccgagctcat 840
agaggctaac 850

<210> 138
<211> 749
<212> DNA

<213> Hepatitis C virus
<400> 138

cagcctcact gtaacccagc tcctgaggcg actgcatcag tggataagct cggagtgtac 60
cactccatgc tccggttcct ggctaaggga catctgggac tggatatgcg aggtgctgag 120
cgactttaag acctggctga aagccaagct catgccacaa ctgcctggga ttccctttgt 180
gtcctgccag cgcgggtata ggggggtctg gcgaggagac ggcattatgc acactcgctg 240
ccactgtgga gctgagatca ctggacatgt caaaaacggg acgatgagga tcgtcggtcc 300
taggacctgc aggaacatgt ggagtgggac gttccccatt aacgcctaca ccacgggccc 360
ctgtactccc cttcctgcgc cgaactataa gttcgcgctg tggagggtgt ctgcagagga 420
atacgtggag ataaggcggg tgggggactt ccactacgta tcgggtatga ctactgacaa 480
tcttaaatgc ccgtgccaga tcccatcgcc cgaatttttc acagaattgg acggggtgcg 540
cctacacagg tttgcgcccc cttgcaagcc cttgctgcgg gaggaggtat cattcagagt 600
aggactccac gagtacccgg tggggtcgca attaccttgc gagcccgaac cggacgtagc 660
cgtgttgacg tccatgctca ctgatccctc ccatataaca gcagaggcgg ccgggagaag 720
gttggcgaga gggtcacccc cttctatgg 749



<210> 139
<211> 257
<212> DNA

<213> Hepatitis C virus
<400> 139

gacctggctg aaagccaagc tcatgccaca actgcctggg attccctttg tgtcctgcca 60
gcgcgggtat aggggggtct ggcgaggaga cggcattatg cacactcgct gccactgtgg 120
agctgagatc actggacatg tcaaaaacgg gacgatgagg atcgtcggtc ctaggacctg 180
caggaacatg tggagtggga cgttccccat taacgcctac accacgggcc cctgtactcc 240
ccttcctgcg ccgaact 257



<210> 140
<211> 285
<212> DNA

<213> Hepatitis C virus
<400> 140

tgagatcact ggacatgtca aaaacgggac gatgaggatc gtcggtccta ggacctgcag 60
gaacatgtgg agtgggacgt tccccattaa cgcctacacc acgggcccct gtactcccct 120
tcctgcgccg aactataagt tcgcgctgtg gagggtgtct gcagaggaat acgtggagat 180
aaggcgggtg ggggacttcc actacgtatc gggtatgact actgacaatc ttaaatgccc 240
gtgccagatc ccatcgcccg aatttttcac agaattggac ggggt 285



<210> 141 
<211> 228 
<212> DNA 

<213> Hepatitis C virus 
<400> 141 

catagaggct aacctcctgt ggaggcagga gatgggcggc aacatcacca gggttgagtc 60 
agagaacaaa gtggtgattc tggactcctt cgatccgctt gtggcagagg aggatgagcg 120
ggaggtctcc gtacctgcag aaattctgcg gaagtctcgg agattcgccc gggccctgcc 180
cgtctgggcg cggccggact acaacccccc gctagtagag acgtggaa 228 



<210> 142
<211> 273
<212> DNA

<213> Hepatitis C virus
<400> 142

ccatggctgc ccgctaccac ctccacggtc ccctcctgtg cctccgcctc ggaaaaagcg 60
tacggtggtc ctcaccgaat caaccctatc tactgccttg gccgagcttg ccaccaaaag 120
ttttggcagc tcctcaactt ccggcattac gggcgacaat acgacaacat cctctgagcc 180
cgccccttct ggctgccccc ccgactccga cgttgagtcc tattcttcca tgccccccct 240
ggagggggag cctggggatc cggatctcag cga 273



<210> 143
<211> 412
<212> DNA

<213> Hepatitis C virus
<400> 143

ttcctggaca ggcgcactcg tcaccccgtg cgctgcggaa gaacaaaaac tgcccatcaa 60
cgcactgagc aactcgttgc tacgccatca caatctggtg tattccacca cttcacgcag 120
tgcttgccaa aggcagaaga aagtcacatt tgacagactg caagttctgg acagccatta 180
ccaggacgtg ctcaaggagg tcaaagcagc ggcgtcaaaa gtgaaggcta acttgctatc 240
cgtagaggaa gcttgcagcc tgacgccccc acattcagcc aaatccaagt ttggctatgg 300
ggcaaaagac gtccgttgcc atgccagaaa ggccgtagcc cacatcaact ccgtgtggaa 360
agaccttctg gaagacagtg taacaccaat agacactacc atcatggcca ag 412



<210> 144
<211> 903
<212> DNA

<213> Hepatitis C virus
<400> 144

ggctaacttg ctatccgtag aggaagcttg cagcctgacg cccccacatt cagccaaatc 60
caagtttggc tatggggcaa aagacgtccg ttgccatgcc agaaaggccg tagcccacat 120
caactccgtg tggaaagacc ttctggaaga cagtgtaaca ccaatagaca ctaccatcat 180
ggccaagaac gaggttttct gcgttcagcc tgagaagggg ggtcgtaagc cagctcgtct 240
catcgtgttc cccgacctgg gcgtgcgcgt gtgcgagaag atggccctgt acgacgtggt 300
tagcaagctc cccctggccg tgatgggaag ctcctacgga ttccaatact caccaggaca 360
gcgggttgaa ttcctcgtgc aagcgtggaa gtccaagaag accccgatgg ggttctcgta 420
tgatacccgc tgttttgact ccacagtcac tgagagcgac atccgtacgg aggaggcaat 480
ttaccaatgt tgtgacctgg acccccaagc ccgcgtggcc atcaagtccc tcactgagag 540
gctttatgtt gggggccctc ttaccaattc aaggggggaa aactgcggct accgcaggtg 600
ccgcgcgagc ggcgtactga caactagctg tggtaacacc ctcacttgct acatcaaggc 660
ccgggcagcc tgtcgagccg cagggctcca ggactgcacc atgctcgtgt gtggcgacga 720
cttagtcgtt atctgtgaaa gtgcgggggt ccaggaggac gcggcgagcc tgagagcctt 780
cacggaggct atgaccaggt actccgcccc ccccggggac cccccacaac cagaatacga 840
cttggagctt ataacatcat gctcctccaa cgtgtcagtc gcccacgacg gcgctggaaa 900
gag 903



<210> 145 
<211> 600 
<212> DNA 

<213> Hepatitis C virus 
<400> 145 

agaggaagct tgcagcctga cgcccccaca ttcagccaaa tccaagtttg gctatggggc 60
aaaagacgtc cgttgccatg ccagaaaggc cgtagcccac atcaactccg tgtggaaaga 120
ccttctggaa gacagtgtaa caccaataga cactaccatc atggccaaga acgaggtttt 180
ctgcgttcag cctgagaagg ggggtcgtaa gccagctcgt ctcatcgtgt tccccgacct 240
gggcgtgcgc gtgtgcgaga agatggccct gtacgacgtg gttagcaagc tccccctggc 300
cgtgatggga agctcctacg gattccaata ctcaccagga cagcgggttg aattcctcgt 360 
gcaagcgtgg aagtccaaga agaccccgat ggggttctcg tatgataccc gctgttttga 420 
ctccacagtc actgagagcg acatccgtac ggaggaggca atttaccaat gttgtgacct 480 
ggacccccaa gcccgcgtgg ccatcaagtc cctcactgag aggctttatg ttgggggccc 540 
tcttaccaat tcaagggggg aaaactgcgg ctaccgcagg tgccgcgcga gcggcgtact 600 



<210> 146 
<211> 781 
<212> DNA 

<213> Hepatitis C virus 
<400> 146 

ccttctggaa gacagtgtaa caccaataga cactaccatc atggccaaga acgaggtttt 60 
ctgcgttcag cctgagaagg ggggtcgtaa gccagctcgt ctcatcgtgt tccccgacct 120 
gggcgtgcgc gtgtgcgaga agatggccct gtacgacgtg gttagcaagc tccccctggc 180 
cgtgatggga agctcctacg gattccaata ctcaccagga cagcgggttg aattcctcgt 240 
gcaagcgtgg aagtccaaga agaccccgat ggggttctcg tatgataccc gctgttttga 300
ctccacagtc actgagagcg acatccgtac ggaggaggca atttaccaat gttgtgacct 360
ggacccccaa gcccgcgtgg ccatcaagtc cctcactgag aggctttatg ttgggggccc 420
tcttaccaat tcaagggggg aaaactgcgg ctaccgcagg tgccgcgcga gcggcgtact 480
gacaactagc tgtggtaaca ccctcacttg ctacatcaag gcccgggcag cctgtcgagc 540
cgcagggctc caggactgca ccatgctcgt gtgtggcgac gacttagtcg ttatctgtga 600
aagtgcgggg gtccaggagg acgcggcgag cctgagagcc ttcacggagg ctatgaccag 660
gtactccgcc ccccccgggg accccccaca accagaatac gacttggagc ttataacatc 720
atgctcctcc aacgtgtcag tcgcccacga cggcgctgga aagagggtct actaccttac 780
c 781



<210> 147 
<211> 382 
<212> DNA 

<213> Hepatitis C virus 
<400> 147 

cgttatctgt gaaagtgcgg gggtccagga ggacgcggcg agcctgagag ccttcacgga 60 
ggctatgacc aggtactccg ccccccccgg ggacccccca caaccagaat acgacttgga 120 
gcttataaca tcatgctcct ccaacgtgtc agtcgcccac gacggcgctg gaaagagggt 180 
ctactacctt acccgtgacc ctacaacccc cctcgcgaga gccgcgtggg agacagcaag 240 
acacactcca gtcaattcct ggctaggcaa cataatcatg tttgccccca cactgtgggc 300
gaggatgata ctgatgaccc atttctttag cgtcctcata gccagggatc agcttgaaca 360
ggctcttaac tgtgagatct ac 382 



<210> 148 
<211> 268 
<212> DNA 

<213> Hepatitis C virus 
<400> 148 

cgtgtcagtc gcccacgacg gcgctggaaa gagggtctac taccttaccc gtgaccctac 60 
aacccccctc gcgagagccg cgtgggagac agcaagacac actccagtca attcctggct 120
aggcaacata atcatgtttg cccccacact gtgggcgagg atgatactga tgacccattt 180 
ctttagcgtc ctcatagcca gggatcagct tgaacaggct cttaactgtg agatctacgg 240 
agcctgctac tccatagaac cactggat 268 



<210> 149 
<211> 222 
<212> DNA 

<213> Hepatitis C virus 
<400> 149 

actccatggc ctcagcgcat tttcactcca cagttactct ccaggtgaaa tcaatagggt 60 
ggccgcatgc ctcagaaaac ttggggtccc gcccttgcga gcttggagac accgggcccg 120 
gagcgtccgc gctaggcttc tgtccagagg aggcagggct gccatatgtg gcaagtacct 180 
cttcaactgg gcagtaagaa caaagctcaa actcactcca at 222 



<210> 150 
<211> 192 
<212> DNA 

<213> Hepatitis C virus 
<400> 150 

ctctccaggt gaaatcaata gggtggccgc atgcctcaga aaacttgggg tcccgccctt 60
gcgagcttgg agacaccggg cccggagcgt ccgcgctagg cttctgtcca gaggaggcag 120
ggctgccata tgtggcaagt acctcttcaa ctgggcagta agaacaaagc tcaaactcac 180
tccaatagcg gc 192 



<210> 151 

<211> 10 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: linker 
sequence 

<400> 151 

gggccacgaa 10 

<210> 152
<211> 13

<212> DNA

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence: linker
sequence

<400> 152 

ttcgtggccc ctg 13 
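
As an illustrative consistency check that is not part of the sequence listing itself, the two linker sequences can be compared: the linker of SEQ ID N°151 appears to be the reverse complement of the first ten nucleotides of SEQ ID N°152. The following is a minimal Python sketch; the helper name `reverse_complement` is hypothetical and chosen only for this example.

```python
# Complement table for the lowercase DNA letters used in the sequence listing.
_COMPLEMENT = str.maketrans("acgt", "tgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string (spaces are ignored)."""
    cleaned = seq.replace(" ", "").lower()
    return cleaned.translate(_COMPLEMENT)[::-1]


seq_151 = "gggccacgaa"      # SEQ ID N°151
seq_152 = "ttcgtggcccctg"   # SEQ ID N°152, spaces removed

# Compare the reverse complement of SEQ ID N°151 with the 5' end of SEQ ID N°152.
print(reverse_complement(seq_151) == seq_152[:10])
```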

<210> 153 
<211> 138 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: pP6 vector 
sequence 

<400> 153 

ctagccatgg ccgcaggggc cgcggccgca ctagtgggga tccttaatta aagggccact 60 

ggggcccccc gtaccggcgt ccccggcgcc ggcgtgatca cccctaggaa ttaatttccc 120 
ggtgaccccg ggggagct 138 

<210> 154 
<211> 128 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: pB5 vector 
sequence 

<400> 154

catggccgca ggggccgcgg ccgcactagt ggggatcctt aattaaaggg ccactggggc 60
cccccggcgt ccccggcgcc ggcgtgatca cccctaggaa ttaatttccc ggtgaccccg 120
ggggagct 128






<210> 155 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: primer
<400> 155 

gcgtttggaa tcactacagg 20 
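
Each record of the listing declares its length under `<211>` and gives the residues after `<400>`, so the two can be cross-checked mechanically. The sketch below is one possible way to do this in Python, assuming one header tag per line and a trailing position counter on each sequence row; the function name `parse_record` is hypothetical.

```python
import re


def parse_record(lines):
    """Return (declared_length, residues) for one sequence-listing record."""
    declared = None
    residues = []
    in_sequence = False
    for line in lines:
        line = line.strip()
        if line.startswith("<211>"):
            declared = int(line.split()[1])
        elif line.startswith("<400>"):
            in_sequence = True
        elif in_sequence and line:
            # Keep only nucleotide letters; this drops the position counter.
            residues.append("".join(re.findall(r"[acgtn]+", line)))
    return declared, "".join(residues)


# SEQ ID N°155 as given in the listing.
record = [
    "<210> 155", "<211> 20", "<212> DNA", "<213> Artificial Sequence",
    "<220>", "<223> Description of Artificial Sequence: primer",
    "<400> 155", "", "gcgtttggaa tcactacagg 20",
]
declared, seq = parse_record(record)
print(declared, len(seq), declared == len(seq))
```

The same check can be applied to any record once its rows are in the six-blocks-per-row layout.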

<210> 156
<211> 19
<212> DNA

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence: primer
<400> 156

cacgatgcac gttgaagtg 19



Claims 


1. A nucleic acid which encodes a polypeptide selected from the group consisting of the amino acid sequences 
SEQ ID N° 1 to 38 or a variant thereof, and a sequence complementary thereto. 

2. A nucleic acid according to claim 1 which encodes a polypeptide having at least 95% aminoacid identity with a
polypeptide selected from the group consisting of the aminoacid sequences SEQ ID N°1 to 38, and a sequence
complementary thereto.
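
By way of illustration only, a percentage of identity such as the 95% threshold above could be computed as follows. This Python sketch assumes that the two sequences have already been aligned to equal length and that gaps, written `-`, count as mismatches; the claims do not prescribe this particular calculation, and both the function name `percent_identity` and the example peptides are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of aligned positions carrying the same residue.

    Both inputs must already be aligned (equal length); '-' marks a gap
    and always counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    if not aligned_a:
        raise ValueError("sequences must not be empty")
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)


# Hypothetical 20-residue peptides differing at the last position only.
reference = "MKTAYIAKQRQISFVKSHFS"
candidate = "MKTAYIAKQRQISFVKSHFA"
identity = percent_identity(reference, candidate)
print(identity >= 95.0)
```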

3. A nucleic acid according to claim 1 which is selected from the group consisting of the sequences SEQ ID N°39 
to 76 or a sequence complementary thereto. 


4. A nucleic acid according to claim 1 which possesses at least 95% nucleic acid identity with a nucleic acid selected 
from the group consisting of the sequences SEQ ID N°39 to 76. 

5. A nucleic acid according to claim 1 encoding a polypeptide having an aminoacid sequence selected from the
group consisting of the sequences consisting of at least:

- 45 consecutive aminoacids of SEQ ID N°1;
- 30 consecutive aminoacids of SEQ ID N°2;
- 65 consecutive aminoacids of SEQ ID N°3;
- 30 consecutive aminoacids of SEQ ID N°4;
- 130 consecutive aminoacids of SEQ ID N°5;
- 25 consecutive aminoacids of SEQ ID N°6;
- 23 consecutive aminoacids of SEQ ID N°7;
- 48 consecutive aminoacids of SEQ ID N°8;
- 36 consecutive aminoacids of SEQ ID N°9;
- 25 consecutive aminoacids of SEQ ID N°10;
- 24 consecutive aminoacids of SEQ ID N°11;
- 37 consecutive aminoacids of SEQ ID N°12;
- 25 consecutive aminoacids of SEQ ID N°13;
- 30 consecutive aminoacids of SEQ ID N°14;
- 27 consecutive aminoacids of SEQ ID N°15;
- 69 consecutive aminoacids of SEQ ID N°16;
- 130 consecutive aminoacids of SEQ ID N°17;
- 33 consecutive aminoacids of SEQ ID N°18;
- 25 consecutive aminoacids of SEQ ID N°19;
- 40 consecutive aminoacids of SEQ ID N°20;
- 78 consecutive aminoacids of SEQ ID N°21;
- 39 consecutive aminoacids of SEQ ID N°22;
- 57 consecutive aminoacids of SEQ ID N°23;
- 26 consecutive aminoacids of SEQ ID N°24;
- 68 consecutive aminoacids of SEQ ID N°25;
- 34 consecutive aminoacids of SEQ ID N°26;
- 42 consecutive aminoacids of SEQ ID N°27;
- 48 consecutive aminoacids of SEQ ID N°28;
- 102 consecutive aminoacids of SEQ ID N°29;
- 49 consecutive aminoacids of SEQ ID N°30;
- 92 consecutive aminoacids of SEQ ID N°31;
- 71 consecutive aminoacids of SEQ ID N°32;
- 55 consecutive aminoacids of SEQ ID N°33;
- 69 consecutive aminoacids of SEQ ID N°34;
- 23 consecutive aminoacids of SEQ ID N°35;
- 33 consecutive aminoacids of SEQ ID N°36;
- 32 consecutive aminoacids of SEQ ID N°37; and
- 22 consecutive aminoacids of SEQ ID N°38.

6. A nucleic acid according to claim 1 encoding a polypeptide having an amino acid sequence comprising from
one to three substitutions, additions or deletions of one amino acid as regards a polypeptide selected from the
group consisting of the amino acid sequences SEQ ID N°1 to 38 or as regards a polypeptide according to claim
5 and a sequence complementary thereto.
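
For illustration, the number of single-residue substitutions, additions or deletions separating two sequences can be estimated with the Levenshtein edit distance. The Python sketch below is a generic implementation under that assumption and is not a method defined by the claims; the function names and the example peptides are hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance counting single-residue substitutions,
    insertions and deletions."""
    previous = list(range(len(b) + 1))
    for i, residue_a in enumerate(a, start=1):
        current = [i]
        for j, residue_b in enumerate(b, start=1):
            cost = 0 if residue_a == residue_b else 1
            current.append(min(
                previous[j] + 1,         # deletion from a
                current[j - 1] + 1,      # insertion into a
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def is_one_to_three_changes(reference: str, variant: str) -> bool:
    """True when the minimal number of single-residue edits is 1, 2 or 3."""
    return 1 <= edit_distance(reference, variant) <= 3


reference = "MKTAYIAKQR"  # hypothetical peptide
variant = "MKTAYLAKR"     # one substitution (I to L) and one deletion (Q)
print(edit_distance(reference, variant), is_one_to_three_changes(reference, variant))
```

The edit distance is the minimal number of changes, so it gives a lower bound on the number of modifications actually introduced.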

7. A polypeptide selected from the group consisting of the amino acid sequences SEQ ID N°1 to 38 or a variant 
thereof. 


8. A polypeptide according to claim 7 having at least 95% aminoacid identity with a polypeptide selected from the 
group consisting of the aminoacid sequences SEQ ID N°1 to 38. 

9. A polypeptide according to claim 7 having an aminoacid sequence selected from the group consisting of the
sequences consisting of at least:

- 45 consecutive aminoacids of SEQ ID N°1;
- 30 consecutive aminoacids of SEQ ID N°2;
- 65 consecutive aminoacids of SEQ ID N°3;
- 30 consecutive aminoacids of SEQ ID N°4;
- 130 consecutive aminoacids of SEQ ID N°5;
- 25 consecutive aminoacids of SEQ ID N°6;
- 23 consecutive aminoacids of SEQ ID N°7;
- 48 consecutive aminoacids of SEQ ID N°8;
- 36 consecutive aminoacids of SEQ ID N°9;
- 25 consecutive aminoacids of SEQ ID N°10;
- 24 consecutive aminoacids of SEQ ID N°11;
- 37 consecutive aminoacids of SEQ ID N°12;
- 25 consecutive aminoacids of SEQ ID N°13;
- 30 consecutive aminoacids of SEQ ID N°14;
- 27 consecutive aminoacids of SEQ ID N°15;
- 69 consecutive aminoacids of SEQ ID N°16;
- 130 consecutive aminoacids of SEQ ID N°17;
- 33 consecutive aminoacids of SEQ ID N°18;
- 25 consecutive aminoacids of SEQ ID N°19;
- 40 consecutive aminoacids of SEQ ID N°20;
- 78 consecutive aminoacids of SEQ ID N°21;
- 39 consecutive aminoacids of SEQ ID N°22;
- 57 consecutive aminoacids of SEQ ID N°23;
- 26 consecutive aminoacids of SEQ ID N°24;
- 68 consecutive aminoacids of SEQ ID N°25;
- 34 consecutive aminoacids of SEQ ID N°26;
- 42 consecutive aminoacids of SEQ ID N°27;
- 48 consecutive aminoacids of SEQ ID N°28;
- 102 consecutive aminoacids of SEQ ID N°29;
- 49 consecutive aminoacids of SEQ ID N°30;
- 92 consecutive aminoacids of SEQ ID N°31;
- 71 consecutive aminoacids of SEQ ID N°32;
- 55 consecutive aminoacids of SEQ ID N°33;
- 69 consecutive aminoacids of SEQ ID N°34;
- 23 consecutive aminoacids of SEQ ID N°35;
- 33 consecutive aminoacids of SEQ ID N°36;
- 32 consecutive aminoacids of SEQ ID N°37; and
- 22 consecutive aminoacids of SEQ ID N°38.



10. A polypeptide according to claim 7 having an amino acid sequence comprising from one to three substitutions,
additions or deletions of one amino acid as regards a polypeptide selected from the group consisting of the amino
acid sequences SEQ ID N°1 to 38 or as regards a polypeptide according to claim 9.

11. An antibody directed against a polypeptide according to any one of claims 7 to 10. 

12. A recombinant vector containing inserted therein a nucleic acid according to any one of claims 1 to 5. 


13. The recombinant vector according to claim 12 which is selected from the group consisting of the plasmids 
pACTIIst and pAS2AA. 

14. The recombinant vector according to claim 12 which is selected from the group consisting of pT25, pKT25,
pUT18 and pUT18C.

15. The recombinant vector according to claim 12 which is selected from the group consisting of pP6 and pB5. 

16. A cell host transformed with a vector according to any one of claims 12 to 15 or with a nucleic acid according
to any one of claims 1 to 5.

17. A method for producing a polypeptide according to any one of claims 7 to 10, wherein said method comprises
the steps of:

a) cultivating a cell host according to claim 18 in an appropriate culture medium;

b) recovering the recombinant polypeptide from the culture supernatant or from the cell lysate.

18. A yeast two-hybrid system method for selecting a recombinant cell clone containing a vector comprising a
nucleic acid insert encoding a prey polypeptide which binds with a SID® polypeptide, wherein said method
comprises the steps of:

a) mating at least one first recombinant yeast cell clone of a collection of recombinant yeast cell clones
transformed with a plasmid containing the prey polynucleotide to be assayed with a second haploid recombinant
Saccharomyces cerevisiae cell clone transformed with a plasmid containing a bait polynucleotide encoding a
SID® polypeptide according to any one of claims 7 to 10;

b) cultivating diploid cells obtained in step a) on a selective medium; and

c) selecting recombinant cell clones which grow on said selective medium.

19. The yeast two-hybrid method of claim 18 which further comprises the step of:

d) characterizing the prey polynucleotide contained in each recombinant cell clone selected in step c). 

20. A bacterial two-hybrid method for identifying a recombinant cell clone containing a prey polynucleotide encoding
a prey polypeptide which binds with a SID® polypeptide, wherein said method comprises the steps of:

a) transforming bacterial cell clones with a plasmid containing a SID® polynucleotide encoding a SID®
polypeptide according to any one of claims 7 to 10;

b) rescuing prey plasmids containing prey polynucleotides wherein each prey polynucleotide is a DNA fragment
from the genome of a desired organism and wherein each prey plasmid is contained in one recombinant yeast
cell clone of a collection of recombinant yeast cell clones;

c) transforming the recombinant bacterial cell clones obtained in step a) with the plasmids rescued in step b);

d) cultivating bacterial recombinant cells obtained in step c) on a selective medium; and

e) selecting recombinant cell clones which grow on said selective medium.

21. The bacterial two-hybrid method of claim 20, wherein said method further comprises the step of f)
characterising the prey polynucleotide contained in each recombinant cell clone selected at step e).

22. The method according to any one of claims 18 to 21, wherein the prey polypeptide is a human polypeptide.

23. The method according to any one of claims 18 to 21, wherein the prey polypeptide is an HCV polypeptide. 

24. The method of claim 23, wherein the prey polypeptide is encoded by a strain of HCV which is pathogenic for
humans.

25. A set of two nucleic acids consisting of:

i) a first nucleic acid encoding a SID® polypeptide according to any one of claims 7 to 10; and

ii) a second nucleic acid encoding a prey polypeptide which binds specifically with the SID® polypeptide defined
in i).

26. A set of two nucleic acids which is selected from the group consisting of the following sets:

SEQ ID N°77/SEQ ID N°1; SEQ ID N°78/SEQ ID N°2; SEQ ID N°78/SEQ ID N°3; SEQ ID N°79/SEQ ID N°4;
SEQ ID N°80/SEQ ID N°5; SEQ ID N°81/SEQ ID N°6; SEQ ID N°82/SEQ ID N°7; SEQ ID N°83/SEQ ID N°8;
SEQ ID N°84/SEQ ID N°9; SEQ ID N°85/SEQ ID N°10; SEQ ID N°86/SEQ ID N°11; SEQ ID N°87/SEQ ID N°12;
SEQ ID N°88/SEQ ID N°13; SEQ ID N°89/SEQ ID N°14; SEQ ID N°90/SEQ ID N°15; SEQ ID N°91/SEQ ID N°16;
SEQ ID N°92/SEQ ID N°17; SEQ ID N°93/SEQ ID N°18; SEQ ID N°94/SEQ ID N°19; SEQ ID N°95/SEQ ID N°20;
SEQ ID N°96/SEQ ID N°21; SEQ ID N°97/SEQ ID N°22; SEQ ID N°98/SEQ ID N°23; SEQ ID N°99/SEQ ID N°24;
SEQ ID N°100/SEQ ID N°25; SEQ ID N°101/SEQ ID N°26; SEQ ID N°102/SEQ ID N°27; SEQ ID N°103/SEQ ID N°28;
SEQ ID N°104/SEQ ID N°29; SEQ ID N°105/SEQ ID N°30; SEQ ID N°106/SEQ ID N°31; SEQ ID N°107/SEQ ID N°32;
SEQ ID N°108/SEQ ID N°33; SEQ ID N°109/SEQ ID N°34; SEQ ID N°110/SEQ ID N°35; SEQ ID N°111/SEQ ID N°36;
SEQ ID N°112/SEQ ID N°37; and SEQ ID N°113/SEQ ID N°38.

27. A set of two polypeptides consisting of:

i) a first polypeptide consisting of a SID® polypeptide according to any one of claims 7 to 10; and 

ii) a second polypeptide, also termed prey polypeptide, which binds specifically with the first polypeptide. 

28. A set of two polypeptides which is selected from the group consisting of the following sets:

SEQ ID N°114/SEQ ID N°39; SEQ ID N°115/SEQ ID N°40; SEQ ID N°115/SEQ ID N°41; SEQ ID N°116/SEQ ID N°42;
SEQ ID N°117/SEQ ID N°43; SEQ ID N°118/SEQ ID N°44; SEQ ID N°119/SEQ ID N°45; SEQ ID N°120/SEQ ID N°46;
SEQ ID N°121/SEQ ID N°47; SEQ ID N°122/SEQ ID N°48; SEQ ID N°123/SEQ ID N°49; SEQ ID N°124/SEQ ID N°50;
SEQ ID N°125/SEQ ID N°51; SEQ ID N°126/SEQ ID N°52; SEQ ID N°127/SEQ ID N°53; SEQ ID N°128/SEQ ID N°54;
SEQ ID N°129/SEQ ID N°55; SEQ ID N°130/SEQ ID N°56; SEQ ID N°131/SEQ ID N°57; SEQ ID N°132/SEQ ID N°58;
SEQ ID N°133/SEQ ID N°59; SEQ ID N°134/SEQ ID N°60; SEQ ID N°135/SEQ ID N°61; SEQ ID N°136/SEQ ID N°62;
SEQ ID N°137/SEQ ID N°63; SEQ ID N°138/SEQ ID N°64; SEQ ID N°139/SEQ ID N°65; SEQ ID N°140/SEQ ID N°66;
SEQ ID N°141/SEQ ID N°67; SEQ ID N°142/SEQ ID N°68; SEQ ID N°143/SEQ ID N°69; SEQ ID N°144/SEQ ID N°70;
SEQ ID N°145/SEQ ID N°71; SEQ ID N°146/SEQ ID N°72; SEQ ID N°147/SEQ ID N°73; SEQ ID N°148/SEQ ID N°74;
SEQ ID N°149/SEQ ID N°75; and SEQ ID N°150/SEQ ID N°76.

29. A complex formed between the two polypeptides of claim 29 or 30.

30. A method for selecting a molecule which inhibits the binding between a set of two polypeptides according to
claim 27 or 28, wherein said method comprises the steps of:

a) cultivating a recombinant host cell containing a reporter gene the expression of which is toxic for said
recombinant host cell, said host cell being transformed with two vectors wherein:

i) the first vector contains a nucleic acid comprising a polynucleotide encoding a first hybrid polypeptide
containing one of said two polypeptides and a DNA binding domain;

ii) the second vector contains a nucleic acid comprising a polynucleotide encoding a second hybrid
polypeptide containing the second of said two polypeptides and an activating domain capable of activating
said toxic reporter gene when the first and the second hybrid polypeptides are interacting;

on a selective medium containing the molecule to be tested and allowing the growth of said recombinant
host cell when the toxic reporter gene is not activated; and

b) selecting the molecule which inhibits the growth of the recombinant host cell defined in step a).

31. A method for selecting a molecule which inhibits the protein-protein interaction of a set of two polypeptides
according to claim 29 or 30, wherein said method comprises the steps of:

a) cultivating a recombinant host cell containing a reporter gene the expression of which is toxic for said
recombinant host cell, said host cell being transformed with two vectors wherein:

i) the first vector contains a nucleic acid comprising a polynucleotide encoding a first hybrid polypeptide
containing one of said set of two polypeptides and the first domain of an enzyme;

ii) the second vector contains a nucleic acid comprising a polynucleotide encoding a second hybrid
polypeptide containing the second of said two polypeptides and the second part of said enzyme capable
of activating said toxic reporter gene when the first and the second hybrid polypeptides are interacting,
said interaction recovering the catalytic activity of the enzyme;

on a selective medium containing the molecule to be tested and allowing the growth of said recombinant
host cell when the toxic gene is not activated; and

b) selecting the molecule which inhibits the growth of the recombinant host cell defined in step a).

32. A kit for the screening of a molecule which inhibits the protein-protein interaction of a set of two polypeptides
according to claim 27 or 28, wherein said kit comprises a recombinant cell host containing a reporter gene the
expression of which is toxic for said recombinant cell host, said cell host being transformed with two vectors
wherein:

i) the first vector contains a nucleic acid comprising a polynucleotide encoding a first hybrid polypeptide
containing one of said two polypeptides and a DNA binding domain;

ii) the second vector contains a nucleic acid comprising a polynucleotide encoding a second hybrid polypeptide
containing the second of said two polypeptides and an activating domain capable of activating said toxic
reporter gene when the first and the second hybrid polypeptides are interacting.

33. A kit for the screening of a molecule which inhibits the protein-protein interaction of a set of two polypeptides
according to claim 27 or 28, wherein said kit comprises a recombinant host cell containing a reporter gene the
expression of which is toxic for said recombinant host cell, said host cell being transformed with two plasmids
wherein:

i) the first vector contains a nucleic acid comprising a polynucleotide encoding a first hybrid polypeptide
containing one of said two polypeptides and the first domain of an enzyme;

ii) the second plasmid contains a nucleic acid comprising a polynucleotide encoding a second hybrid
polypeptide containing the second of said two polypeptides and the second part of said enzyme capable of activating
said toxic reporter gene when the first and the second hybrid polypeptides are interacting, said interaction
recovering the catalytic activity of the enzyme.

34. A marker compound wherein said compound comprises : 

a) a Selected Interacting Domain (SID®) polypeptide according to any one of claims 7 to 10 or a variant thereof; 
and 

b) a detectable molecule bound thereto. 

35. The marker compound of claim 34, wherein the detectable molecule consists of a fluorescent protein. 

36. The marker compound of claim 35, wherein the detectable protein is selected from the group consisting of 
GFP and YFP. 

37. The marker compound of claim 35, wherein the detectable molecule is endowed with a catalytic activity. 

38. The marker compound of claim 37, wherein the detectable molecule is selected from the group consisting of 
a hydrolase, a transferase, a lyase, an isomerase, a ligase, a synthetase and an oxidoreductase.

39. The marker compound of claim 34, wherein the detectable molecule is radioactive. 

40. The marker compound of claim 34, wherein the detectable molecule is chemiluminescent. 

41. The marker compound according to any one of claims 34 to 40, wherein the detectable molecule is covalently 
bound to the Selected Interacting Domain (SID®) polypeptide or a variant thereof. 

42. The marker compound according to any one of claims 34 to 40, wherein the detectable molecule is
non-covalently bound to the Selected Interacting Domain (SID®) polypeptide or a variant thereof.

43. The marker compound of claim 42, wherein the detectable molecule is an antibody directed specifically against 
the Selected Interacting Domain (SID®) polypeptide. 

44. The marker compound of claim 43, wherein said antibody is labelled radioactively or non radioactively. 

45. The marker compound according to claim 34, wherein : 

a) the Selected Interacting Domain (SID®) polypeptide or a variant thereof is covalently bound to a first ligand;
and 

b) the detectable molecule comprises a second ligand which binds specifically to the first ligand. 

46. The marker compound according to claim 45, wherein the first ligand is biotin and the second ligand is
streptavidin.

47. A nucleic acid encoding a marker compound according to any one of claims 34 to 41. 

48. A nucleic acid encoding the Selected Interacting Domain (SID®) polypeptide or a variant thereof onto which 
is covalently bound a first ligand defined in claims 45 and 46. 

49. A recombinant vector comprising inserted therein a nucleic acid according to any one of claims 47 and 48.

50. The recombinant vector according to claim 48, which is selected from the group consisting of pACTIIst, pASAA,
pT25, pKT25, pUT18, pUT18C, pP6 and pB5.

51. A recombinant host cell which has been transfected with a nucleic acid according to any one of claims 47 and 
48 or a recombinant vector according to any one of claims 43 and 44. 

52. The recombinant host cell according to claim 51 which is of prokaryotic origin. 

53. The recombinant host cell according to claim 51 which is of eukaryotic origin.

54. The recombinant host cell according to claim 52 which is a mammalian host cell. 

55. A method for detecting a polypeptide of interest within a sample, which comprises the steps of:

a) contacting a marker compound or a plurality of marker compounds according to any one of claims 34 to 46
with the sample;

b) detecting the complexes formed between said marker compound or said plurality of marker compounds
and said polypeptide of interest.

56. A kit for detecting a polypeptide of interest within a sample, which comprises a marker compound according 
to any one of claims 34 to 46. 

57. A method for detecting a polypeptide of interest within a prokaryotic or an eukaryotic host cell, said method
comprising the steps of:

a) providing a cell host to be assayed;

b) transfecting said host cell with a nucleic acid according to any one of claims 41 and 42 or with a recombinant
vector according to any one of claims 49 and 50;

c) detecting the complexes formed between the marker compound expressed by the transfected cell host and
the polypeptide of interest.

58. A kit for detecting a polypeptide of interest within a prokaryotic or an eukaryotic host cell which comprises a
nucleic acid according to any one of claims 47 and 48 or a recombinant vector according to any one of claims 49
and 50.

59. A method for detecting a polypeptide of interest within a prokaryotic or eukaryotic host cell, said method
comprising the steps of:

a) providing a cell host to be assayed;

b) introducing a marker compound according to any one of claims 36 to 48 within said cell host;

c) detecting the complexes formed between the marker compound and the polypeptide of interest within the
cell.


60. A kit for detecting a polypeptide of interest within a prokaryotic or eukaryotic host cell comprising a marker 
compound according to any one of claims 34 to 46. 

61. A method for detecting a polypeptide or a plurality of polypeptides of interest within a sample, wherein said 
method comprises the steps of : 

a) providing a substrate onto which a Selected Interacting Domain (SID®) polypeptide according to any one 
of claims 7 to 10 or a variant thereof, or a plurality of Selected Interacting Domain (SID®) polypeptides 
according to any one of claims 7 to 10 or variants thereof is (are) immobilised; 

b) bringing into contact the substrate defined in a) with the sample to be assayed; 

c) detecting the complexes formed between the Selected Interacting Domain (SID®) polypeptide or a variant 
thereof, or the plurality of Selected Interacting Domain (SID®) polypeptides or variants thereof, and a molecule 
or a plurality of molecules initially contained in the sample. 

62. The method of claim 61, wherein a plurality of Selected Interacting Domain (SID®) polypeptides or variants 
thereof are immobilised on the substrate in an ordered manner. 

63. The method of claim 61, wherein the Selected Interacting Domain (SID®) polypeptide or a variant thereof, or 
the plurality of Selected Interacting Domain (SID®) polypeptides or variants thereof are covalently bound to the 
substrate. 

64. The method of claim 61, wherein the Selected Interacting Domain (SID®) polypeptide or a variant thereof, or 
the plurality of Selected Interacting Domain (SID®) polypeptides or variants thereof are non-covalently bound to 
the substrate. 

65. The method of claim 61, wherein the Selected Interacting Domain (SID®) polypeptide or a variant thereof, or 
the plurality of Selected Interacting Domain (SID®) polypeptides or variants thereof are covalently bound to a first 
ligand and wherein the substrate is coated with a second ligand which specifically binds to the first ligand. 

66. The method of claim 61, wherein the first ligand is biotin and the second ligand is streptavidin. 

67. The method according to any one of claims 61 to 66, wherein the Selected Interacting Domain (SID®) polypeptide 
or a variant thereof, or the plurality of Selected Interacting Domain (SID®) polypeptides or variants thereof 
are covalently linked to a spacer and wherein said spacer is covalently bound to the substrate in order to immobilise 
the Selected Interacting Domain (SID®) polypeptide or a variant thereof, or the plurality of Selected Interacting 
Domain (SID®) polypeptides. 

68. The method according to any one of claims 61 to 66 wherein the detection step c) consists of detecting changes 
in optical characteristics of the substrate. 

69. A device for the detection of a polypeptide or a plurality of polypeptides of interest within a sample, wherein 
said device comprises a substrate onto which a Selected Interacting Domain (SID®) polypeptide according to any 
one of claims 7 to 10 or a variant thereof or a plurality of Selected Interacting Domain (SID®) polypeptides according 
to any one of claims 7 to 10 or variants thereof is (are) immobilised. 

70. A pharmaceutical composition comprising a pharmaceutically effective amount of a nucleic acid comprising a 
polynucleotide encoding a Selected Interacting Domain (SID®) polypeptide according to any one of claims 7 to 
10 or a variant thereof. 

71. A method for preventing or curing a viral infection by a Hepatitis C virus in a human or an animal, wherein said 
method comprises a step of administering to the human or animal body a pharmaceutically effective amount of a 
Selected Interacting Domain (SID®) polypeptide according to any one of claims 7 to 10. 

72. A method for preventing or curing a viral infection by a Hepatitis C virus in a human or an animal, wherein said 
method comprises a step of administering to the human or animal body a pharmaceutically effective amount of a 
nucleic acid comprising a polynucleotide encoding a Selected Interacting Domain (SID®) polypeptide according 
to any one of claims 7 to 10, and wherein said polynucleotide is placed under the control of a regulatory sequence 
which is functional in said human or said animal. 

73. A method for preventing or curing a viral or a bacterial infection in a human or an animal, wherein said method 
comprises a step of administering to the human or animal body a pharmaceutically effective amount of a recombinant 
expression vector comprising a polynucleotide encoding a Selected Interacting Domain (SID®) polypeptide 
according to any one of claims 7 to 10. 

74. A method for selecting a SID® polypeptide comprising the steps of: 

1) selecting a collection of nucleic acids (prey nucleic acids) which bind specifically to a given bait polypeptide 
of interest; and 

2) determining the nucleic acid sequences which encode for SID® polypeptides after having generated sets 
of polynucleotides from the collection of nucleic acids selected at step 1). 

75. The method of claim 74, wherein step 1) consists of a yeast two-hybrid method or a bacterial two-hybrid method. 

76. The method of claim 74, wherein step 2) comprises the following steps of : 

a) selecting from the collection of prey polynucleotides obtained at the end of step 1) all prey polynucleotides 
encoding a prey polypeptide capable of interacting with said bait polypeptide and containing a common nucleic 
acid fragment; 

b) aligning the nucleotide sequences of the prey polynucleotides selected at step a) and gathering in one set 
or in a plurality of sets of sequences those nucleotide sequences which have sequences that overlap for more 
than 30% of their respective nucleic acid length, wherein each common overlapping nucleotide sequence in 
one set of sequences defines a sequence encoding a pre-SID® polypeptide; and 

c) aligning two sequences encoding two respective pre-SID® polypeptides, and: 

(i) defining an overlapping nucleic acid sequence between the sequences encoding the two respective 
pre-SID® polypeptides as a sequence encoding a SID® polypeptide, provided that the overlapping 
sequence is of at least 30 nucleotides in length; 

(ii) defining a non-overlapping nucleic acid sequence between the sequences encoding the two respective 
pre-SID® polypeptides as a sequence encoding a SID® polypeptide, provided that (1) said non-overlapping 
sequence has more than 30 nucleotides in length and (2) said non-overlapping sequence represents 
at least 30% in length of any one of the polynucleotides contained in the set of prey polynucleotides used 
for defining the sequence encoding each pre-SID® polypeptide. 
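
By way of illustration only, the length tests of step c) of claim 76 may be sketched as follows. The sketch is not part of the claimed subject-matter and rests on assumptions not found in the text: each pre-SID® sequence is reduced to a half-open coordinate interval on a common reference sequence, the names `overlap`, `non_overlapping` and `candidate_sids` are hypothetical, and the wording "at least 30% in length of any one of the polynucleotides" is read as "at least 30% of the shortest prey polynucleotide of the set", which is only one possible reading.

```python
from typing import List, Tuple

# Assumption: a sequence is represented by a half-open interval [start, end)
# of nucleotide positions on a shared reference sequence.
Interval = Tuple[int, int]

MIN_LEN = 30         # nucleotide threshold used in steps c)(i) and c)(ii)
MIN_FRACTION = 0.30  # fraction threshold used in step c)(ii)


def length(iv: Interval) -> int:
    """Length of an interval, never negative."""
    return max(0, iv[1] - iv[0])


def overlap(a: Interval, b: Interval) -> Interval:
    """Common region of two intervals (zero length if they are disjoint)."""
    return (max(a[0], b[0]), min(a[1], b[1]))


def non_overlapping(a: Interval, b: Interval) -> List[Interval]:
    """Parts of a and b lying outside their common region."""
    ov = overlap(a, b)
    if length(ov) == 0:
        return [a, b]
    parts = []
    for iv in (a, b):
        parts.append((iv[0], ov[0]))  # flank to the left of the overlap
        parts.append((ov[1], iv[1]))  # flank to the right of the overlap
    return [p for p in parts if length(p) > 0]


def candidate_sids(pre_a: Interval, pre_b: Interval,
                   prey_lengths: List[int]) -> List[Interval]:
    """Apply the two length tests of step c) to a pair of pre-SID intervals."""
    sids = []
    ov = overlap(pre_a, pre_b)
    if length(ov) >= MIN_LEN:                      # test of step c)(i)
        sids.append(ov)
    shortest = min(prey_lengths)                   # assumed reading of "any one"
    for part in non_overlapping(pre_a, pre_b):     # tests of step c)(ii)
        if length(part) > MIN_LEN and length(part) >= MIN_FRACTION * shortest:
            sids.append(part)
    return sids
```

Under these assumptions, two pre-SID® intervals (100, 400) and (300, 700) defined from prey polynucleotides of 250, 400 and 500 nucleotides would yield the overlapping region (300, 400) and the two flanking regions (100, 300) and (400, 700) as candidate SID® coding regions.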






77. The method of claim 76 wherein step 2) further comprises the steps of : 



d) counting the number of overlapping prey polynucleotides contained in a first set of polynucleotides defining 
a sequence encoding a first SID® polypeptide; 

e) counting the number of overlapping prey polynucleotides contained in a second set of polynucleotides 
defining a sequence encoding a second SID® polypeptide which overlaps with the sequence encoding the 
first SID® polypeptide; 

f) determining which sequence among those encoding respectively the first SID® polypeptide and the second 
SID® polypeptide has been defined with the largest number of prey polynucleotides and selecting this set of 
prey sequences; 

g) adding to the set of prey sequences selected at step f) those sequences that were contained in the set of 
prey sequences used for defining the sequence encoding the SID® polypeptide with the smallest number of 
prey sequences and which overlap with the sequence encoding the SID® polypeptide with the largest number 
of prey sequences; 

h) aligning the prey sequences added at step g) with the sequences already contained in the set of prey 
sequences which defined the sequence encoding the SID® polypeptide with the largest number of prey 
sequences; 

i) defining an overlapping sequence between the whole sequences which were aligned in step h), wherein 
said overlapping sequence consists of a sequence encoding a SID® polypeptide. 
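
By way of illustration only, steps d) to i) of claim 77 may be sketched as follows. The sketch is not part of the claimed subject-matter and rests on assumptions not found in the text: prey sequences are reduced to half-open coordinate intervals on a common reference sequence, the sequence defined by a set is taken to be the region common to all intervals of that set, ties between equally sized sets are resolved in favour of the first set, and the names `intersect_all` and `merge_sid_sets` are hypothetical.

```python
from typing import List, Optional, Tuple

# Assumption: a prey sequence is a half-open interval [start, end)
# of nucleotide positions on a shared reference sequence.
Interval = Tuple[int, int]


def intersect_all(intervals: List[Interval]) -> Optional[Interval]:
    """Region common to every interval, or None if there is none."""
    start = max(s for s, _ in intervals)
    end = min(e for _, e in intervals)
    return (start, end) if end > start else None


def overlaps(iv: Interval, ref: Interval) -> bool:
    """True if the two intervals share at least one position."""
    return min(iv[1], ref[1]) > max(iv[0], ref[0])


def merge_sid_sets(set_a: List[Interval], set_b: List[Interval]):
    """Return the merged prey set and the region common to all its members."""
    # Steps d) to f): count the preys of each set and keep the larger set
    # (assumption: the first set wins a tie).
    if len(set_a) >= len(set_b):
        larger, smaller = set_a, set_b
    else:
        larger, smaller = set_b, set_a
    sid_larger = intersect_all(larger)

    # Step g): add the preys of the smaller set that overlap the region
    # defined by the larger set.
    added = [p for p in smaller
             if sid_larger is not None and overlaps(p, sid_larger)]
    merged = list(larger) + added

    # Steps h) and i): realign everything and take the common region.
    return merged, intersect_all(merged)
```

Under these assumptions, a first set of three prey intervals (100, 400), (150, 450), (200, 420) and a second set (350, 600), (500, 700) would be merged into the first set plus (350, 600), the interval (500, 700) being left out because it does not overlap the region (200, 400) defined by the first set; the region common to the merged set would be (350, 400).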

78. The method according to any one of claims 74 to 76, wherein the collection of prey nucleic acids is prepared 
starting from the genomic DNA of an organism containing contiguous Open Reading Frames. 

79. The method according to claim 78, wherein said organism is a virus. 

80. The method according to claim 79, wherein the virus consists of the Hepatitis C virus. 


81. The method according to claim 80, wherein the Hepatitis C virus is pathogenic for a mammal, including human. 

82. A SID® nucleic acid selected according to the method of any one of claims 74 to 81. 

83. A SID® polypeptide encoded by a nucleic acid according to claim 82. 
Although claims 71 to 73 are directed to a method of treatment of the 
human/animal body (Article 52(4) EPC), the search has been carried out 
and based on the alleged effects of the compound/composition. 



Claim(s) searched incompletely: 
26, 28 

Reason for the limitation of the search: 

The numbering of the paired sets of nucleic acids and polypeptides 
given in claims 26 and 28 corresponds to the data of table 1 of the 
description only for the first six sets. Therefore 
a meaningful exhaustive formulation of the inventions 7 to 38 was not 
possible. It could be supposed that the number 7 is missing in the last 
column of this table 1, leading to a last amino acid sequence SEQ ID 39 
which in fact corresponds to the nucleic acid sequence encoding the SID 
peptide SEQ ID NO 1 according to the sequence listing. 



The correct numbering of claim 296 was evidently claim 29, and 
the search report and supplemental sheet B were written taking this 
correction into account. 

Similarly, claims 57 and 58 were read as referring to claim 49 
instead of the obviously wrong 496. 
The present European patent application comprised at the time of filing more than ten claims. 

□ Only part of the claims fees have been paid within the prescribed time limit. The present European search 
report has been drawn up for the first ten claims and for those claims for which claims fees have 
been paid, namely claim(s): 




European Patent 
Office 



□ No claims fees have been paid within the prescribed time limit. The present European search report has 
been drawn up for the first ten claims. 



LACK OF UNITY OF INVENTION 



The Search Division considers that the present European patent application does not comply with the 
requirements of unity of invention and relates to several inventions or groups of inventions, namely: 



see sheet B 



□ 
□ 
□ 



All further search fees have been paid within the fixed time limit. The present European search report has 
been drawn up for all claims. 

As all searchable claims could be searched without effort justifying an additional fee, the Search Division 
did not invite payment of any additional fee. 

Only part of the further search fees have been paid within the fixed time limit. The present European 
search report has been drawn up for those parts of the European patent application which relate to the 
inventions in respect of which search fees have been paid, namely claims: 



None of the further search fees have been paid within the fixed time limit. The present European search 
report has been drawn up for those parts of the European patent application which relate to the invention 
first mentioned in the claims, namely claims: 

see sheet B invention group 1. 
The Search Division considers that the present European patent application does not comply with the 
requirements of unity of invention and relates to several inventions or groups of inventions, namely: 

1. Claims: Invention 1 : Partially 1-83 

Invention 1 : 

A nucleic acid which encodes a so named SID polypeptide of 
amino acid sequence SEQ ID No 1 or a variant thereof, or a 
fragment of at least 45 amino acids, or consisting of the 
sequence SEQ ID No 39; a SID polypeptide of the amino acid 
sequence SEQ ID No 1 or a variant thereof, or a fragment of 
at least 45 amino acids; an antibody directed against said 
polypeptide; a recombinant vector containing such a nucleic 
acid; a cell host transformed with said vector; a method of 
producing said polypeptide by cultivating said cell host; a 
yeast two-hybrid system method for selecting a recombinant 
cell clone containing a vector comprising a nucleic acid 
insert encoding a prey polypeptide which binds with said SID 
polypeptide ; a bacterial two-hybrid system method for 
identifying a recombinant cell clone containing a prey 
polynucleotide encoding a prey polypeptide which binds with 
said SID polypeptide ; a set of two nucleic acids consisting 
of i) a first nucleic acid encoding the SID polypeptide of 
SEQ ID No 1 or a variant or a fragment of at least 45 amino 
acids thereof ii) a second nucleic acid encoding a prey 
polypeptide which binds specifically to said SID 
polypeptide, particularly SEQ ID No 77; a set of two 
polypeptides consisting of i) said SID polypeptide of SEQ ID 
No 1 or a variant or a fragment of at least 45 amino acids 
thereof ii) a second prey polypeptide which binds 
specifically to said SID polypeptide, particularly SEQ ID 
No 114; a method for selecting a molecule which inhibits the 
binding between said set of two polypeptides; a kit for the 
screening of a molecule which inhibits the protein-protein 
interaction of said set of two polypeptides; a marker 
compound comprising a) said SID polypeptide and b) a 
detectable molecule bound thereto; a nucleic acid encoding 
said marker compound, a vector or a host cell comprising it; 
a method, a kit or a device for detecting at least a 
polypeptide of interest using said marker compound; a 
pharmaceutical composition comprising a nucleic acid 
comprising a polynucleotide encoding said SID polypeptide; a 
method for preventing or curing an infection using said 
nucleic acid; a method for selecting a SID polypeptide 
comprising the steps of 1) selecting a collection of 
nucleic acids (prey nucleic acids) which bind specifically 
to a given bait polypeptide of interest; and 2) determining 
the nucleic acid sequences which encode said SID 
polypeptides. 



2. Claims: Inventions 2 to 38 : Partially 1-83 
Inventions 2 to 38 : 